
Clinical Decision Systems




Pirkko Nykänen
VTT Information Technology

Niilo Saranummi
VTT Information Technology

Clinical Decision Making
Clinical Decision Systems History
Clinical Application Areas and Types of Systems
Requirements and Critical Issues for Clinical Decision Systems Development
Evaluation of Clinical Decision Systems




Clinical decision systems are information systems that support and assist health care professionals in clinical decision making tasks like diagnosis, therapy planning and monitoring. Their purpose is not to replace or substitute for human abilities or skills. The value of clinical decision systems depends on their role and contribution to the patient care process. Their role may be to act as a reference book of relevant medical knowledge or similar patient cases, or as a watchdog to screen routines or detect deviations, or they may remind the user of extreme possibilities or forecast outcomes of different alternatives. Additionally, these systems may produce information from data by interpretation, work as tutoring or guidance systems, and represent a more experienced consultant for a less experienced user.

Clinical Decision Making

The objective of health care is to understand, prevent and treat human diseases. In medical practice a clinician is involved in a complex cognitive task where he applies different kinds of knowledge on several cognitive and epistemic levels. A clinician develops, through education and experience, an understanding of fundamental anatomic, physiologic and pathologic processes, an understanding of how to apply diagnostic methods and procedures, and an understanding of the effects of drugs and treatments on patients. In a decision making situation a clinician has to understand the clinical state of the patient, the relevant processes going on in the patient, and he must have an understanding of the purpose of his intervention and of the possible actions he can initiate.

The central goal of medical research is to improve clinical practice. This can be achieved in many different ways. The focus may be on improving the theoretical understanding of the physiologic and pathologic processes on a high level of abstraction, on studying diagnostic methods, on evaluating new therapies, or on developing preventive measures [Pedersen et al. 1993]. In some cases medical research is close to basic natural science, in other situations it is close to applied engineering research.

A medical decision-maker acquires and manages several different forms of knowledge. Basically two types of knowledge are involved in decision-making [van Bemmel and Musen 1997, Pedersen et al. 1993]: scientific or formal knowledge and experiential knowledge. Scientific knowledge deals with understanding the scientific principles and relationships between pathophysiological conditions and disease symptoms. This type of knowledge is found in scientific literature and articles. Scientific knowledge helps to explain and justify empirical phenomena and it tells to what extent simplifications and approximations make sense in real life situations. Experiential knowledge originates from well-documented patient cases and validated guidelines. Experiential knowledge helps the clinician to recognize a disease based on his experience. Experiential knowledge in decision making needs to be justified and validated within the framework of scientific knowledge, and when this is done, the most efficient computational models can be achieved. Scientific and experiential knowledge can be characterized as deep and shallow, or surface knowledge, respectively. The terms tacit and explicit knowledge provide another viewpoint on knowledge. Tacit knowledge in a sense is a descriptor for the skills of a clinician. Often during the course of his professional life, he may have operationalized his experiential and scientific knowledge to a degree that he can no longer explain “what he knows” explicitly.

Medicine is a complex area for clinical decision systems. The complexity arises from many sources. Medicine is not a formal science; it has an empirical basis. We have shallow or limited understanding of many of the disease processes and also of the medical decision making. The challenge of a physician is to be up-to-date on current “best practices” in medicine. Unfortunately a large part of medical knowledge is based on experience rather than on hard facts. “Evidence based medicine” as a large-scale initiative aims to improve that ratio [Cochrane].

In medical decision making, scientific and experiential knowledge are interwoven. A normal situation is that a clinician knows enough and has enough reliable data available to make a decision in that situation. Medical expertise, or medical competence, is situational and it is used in a data-intensive environment. The kind of knowledge which is relevant for decision making is huge, even in restricted subspecialities of medicine [Miller 1988]. Diagnosis is often said to be the main task of the physician and a lot of attempts have been made to model the diagnostic process. However, diagnosis is not the only decision making task of a medical professional. He is also involved in decisions that concern selection of treatments, monitoring and follow up, interpretation of signs and symptoms, prediction of consequences and effects, selection of drugs and procedures, etc. Each clinician applies his own pathway to decision making depending on the case at hand, taking into account the patient and external factors like the urgency of the situation, difficulty of the case, role that the physician assumes and possible choices and actions that can be taken.

In clinical decision making, medical knowledge is used to interpret patient data. This raises the issue of what patient data is available. The problem with patient data is that it is highly context sensitive. A major part of patient data collected during clinical episodes is unstructured. Referrals and discharge letters are mainly unstructured summaries accompanied by structured elements like laboratory findings. Each clinician has his own way of embedding the context into these; and other clinicians need to know this before they can truly interpret each other’s notes. Several ways have been developed to summarize this information. These include classifiers like ICD-10, SNOMED, Read codes and DRGs. Much more ambitious are the two major projects which were launched on both sides of the Atlantic some years ago to develop “translators” for communication between clinicians, the Unified Medical Language System (UMLS) and Generalized Architecture for Languages, Encyclopedias and Nomenclatures in Medicine (GALEN).

Clinical Decision Systems History

The first approaches to providing computer support for medical decision making were based on Bayesian statistics. Programs for diagnostic problems showed that impressive diagnostic accuracy could be achieved if the computer programs were supported by reliable data [Shortliffe, 1988]. However, the demand for independence between variables in these approaches was found to be too restrictive in medicine, and many problems were related to the accessibility of reliable data. It was also noticed that, to be able to tackle the information problem, simultaneous perspectives were needed from other disciplines such as decision theory, linguistics, mathematical modeling, and cognitive psychology [Gremy, 1988].
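The independence requirement mentioned above can be made concrete with a small sketch of a Bayesian diagnostic calculation. The disease names, findings, and probabilities below are hypothetical values chosen for illustration only, and the function name `naive_bayes_posterior` is an assumption, not something taken from the systems cited.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, findings):
    """Return P(disease | findings), assuming the findings are
    conditionally independent given the disease."""
    unnormalized = {}
    for disease, prior in priors.items():
        # Multiply the prior by P(finding | disease) for every observed finding.
        # This product is only valid under the independence assumption.
        unnormalized[disease] = prior * prod(
            likelihoods[disease][finding] for finding in findings
        )
    total = sum(unnormalized.values())
    return {disease: p / total for disease, p in unnormalized.items()}


# Hypothetical numbers for illustration; not clinical data.
priors = {"appendicitis": 0.2, "nonspecific_pain": 0.8}
likelihoods = {
    "appendicitis": {"rebound_tenderness": 0.7, "fever": 0.6},
    "nonspecific_pain": {"rebound_tenderness": 0.1, "fever": 0.2},
}
posterior = naive_bayes_posterior(
    priors, likelihoods, ["rebound_tenderness", "fever"]
)
```

If two findings are in fact correlated, the product counts the same evidence twice and overstates the posterior, which is one way to read the restriction described in the text.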

Research with artificial intelligence in medicine started in the 1960s based on the understanding that when computer systems are used to emulate medical decision making steps, they need relevant medical knowledge and this knowledge has to be represented in a context-sensitive way. Much of this knowledge must be extracted from the experts in the field; the statistical data approach is not enough. Consequently, symbolic reasoning methods are needed. The early work resulted in knowledge-based consultation systems like MYCIN [Shortliffe, 1976] and INTERNIST [Miller et al., 1982] among others.

The focus of this early work was on the development of methodologies and tools to support modeling of a domain, knowledge acquisition including machine learning approaches, modeling of reasoning strategies, and knowledge representations. Interests in reasoning strategies were focused on causal, temporal, heuristic, strategic, and anatomic reasoning and reasoning from multiple sources of knowledge. Issues of validation of knowledge and evaluation of prototype systems were also raised. The term expert system was introduced and much of the research was focused on how to represent expert knowledge, how to acquire it and how to use it in problem solving. Certainty factors and other heuristic scoring measures were developed to manage probabilities and weights of evidence. Issues related to users and their environments were also tackled, like human-computer interaction and interfaces to databases and database management systems, integration of natural language processing with knowledge-based systems, and integration of medical decision analysis with artificial intelligence techniques [Buchanan and Shortliffe, 1984, Miller, 1988, Shortliffe et al., 1990].
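As an illustration of how certainty factors accumulate weights of evidence, the following is a minimal sketch of the combination rule commonly associated with MYCIN-style systems. The chapter does not spell the rule out, so the formula and the function name `combine_cf` should be read as an assumption about the usual formulation, with certainty factors ranging from -1 (disconfirmed) to +1 (confirmed).

```python
def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] bearing on the same hypothesis.

    Assumed MYCIN-style rule; undefined when one factor is +1 and the
    other is -1 (the denominator becomes zero).
    """
    if cf1 >= 0 and cf2 >= 0:
        # Two pieces of supporting evidence reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two pieces of opposing evidence reinforce each other.
        return cf1 + cf2 * (1 + cf1)
    # Conflicting evidence partially cancels.
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))


supporting = combine_cf(0.6, 0.4)
conflicting = combine_cf(0.6, -0.4)
```

The rule is heuristic: the result never leaves the interval [-1, 1], but it is not a probability.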

In the late 1980s the rapid evolution both in AI hardware and software caused a big change. The expert system shells (i.e., tools to develop expert systems where the inference engine is separated from the knowledge base) were introduced and widely used. Cheap and effective microcomputers as well as dedicated professional workstations for knowledge engineering offered integrated and interactive environments for expert systems development. Progress in computer graphics, multimedia, and imaging provided further support for this.
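The separation between inference engine and knowledge base that characterizes a shell can be sketched in a few lines. The forward-chaining engine below is generic; only the rule list carries domain content. The rules and fact names are hypothetical and purely illustrative, and real shells offered far richer rule languages.

```python
def forward_chain(rules, facts):
    """Generic inference engine: fire rules until no new fact is derived.

    `rules` is a list of (premises, conclusion) pairs; `facts` is an
    iterable of known facts. The engine knows nothing about medicine.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Knowledge base (hypothetical rules, for illustration only).
rules = [
    (("fever", "cough"), "suspect_respiratory_infection"),
    (("suspect_respiratory_infection", "abnormal_chest_xray"), "suspect_pneumonia"),
]

derived = forward_chain(rules, {"fever", "cough", "abnormal_chest_xray"})
```

Swapping in a different rule list changes the application without touching `forward_chain`, which is the property that made shells reusable.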

The focus shifted to attempts to find general solutions to problems through modeling approaches. The final result of the development of a knowledge-based system was seen to be a model of expertise, which in turn is implemented in the knowledge base. These models of expertise incorporated assumptions about knowledge representations, knowledge base contents, problem-solving methods and application tasks [Musen, 1989]. Knowledge level modeling was widely recognized to be the major way to generalize and structure domain and task knowledge. The knowledge level model describes the problem-solving process in terms of general abstract conceptualizations without reference to the implementations [Newell, 1982].

Moving from the knowledge level to the knowledge-use level, however, posed a problem [Newell, 1982, Steels 1990]. Several frameworks were suggested for this: heuristic classification with the focus on the inference structures [Clancey, 1985], distinctions between deep and shallow knowledge at the knowledge level [Chandrasekaran and Mittal, 1982], a problem-solving method approach focusing on how the problem might be solved at the knowledge-use level [Bennet, 1985], and finally the generic task approach where the analysis of expertise is performed in terms of the tasks and ordering of the tasks [Chandrasekaran, 1986].

From the process point of view, medical reasoning may be modeled by fundamental medical tasks, e.g., diagnosis, therapy planning and monitoring. An AI-based clinical decision system requires formulation of an inference model (i.e., the reasoning process) which exploits an appropriate ontology (i.e., the medical process). An architecture then provides a conceptual description of these components at the knowledge and computational level [Stefanelli, 1993].

In spite of that progress, AI-based systems did not gain widespread use. What are the reasons for this? Most AI systems were developed within a narrow framework for a few dedicated users and for well-defined tasks and restricted domains. Also, development projects were mostly very technology- or methodology-oriented. Knowledge acquisition was also a major barrier. It has been realized that expertise cannot be elicited by interviews and cannot be completely expressed as formal representations, as experts cannot explicate their tacit knowledge.

Today, decision support systems are no longer knowledge-based systems, but rather, knowledge-based approaches have been integrated with traditional information technologies resulting in decision support systems. Also the connectionist approaches of classical AI, e.g., artificial neural networks, have gained more use and there exist many well-functioning neural network applications in restricted, data-rich problem areas. The Internet is a potential platform for the future of AI-based approaches because the flexibility for representations and programs which was typical for AI approaches can be found on the Internet. The Internet also allows new possibilities for distribution and access to clinical decision support.

Clinical Application Areas and Types of Systems

The first application areas were diagnosis and treatment planning. The scope broadened soon to systems addressing tasks of monitoring and follow-up of patients, diseases, and treatments, especially in the domains of chronic diseases and intensive care. Applications have tackled medical, managerial, administrative, and financial decisions in health care. Most clinical decision systems were planned for highly specialized hospital environments, and very rarely were problems of community care, primary care, or occupational health examined. Most successful clinical decision systems exist in the following situations [van Bemmel and Musen 1997]:

When systems offer support for data validation like supporting systems, which are interfaced with laboratory information systems. These kinds of systems help to increase data reliability, to administer data processing and data flow activities, to transmit data, and to manage quality control.

When systems support data reduction, i.e., they reduce, transform, and present data in such a way that it is easier for the health care professional to interpret data and screen the massive data flow. Typical areas for this type of system are intensive care and operating rooms.

When systems support data acquisition like in medical imaging where data is processed to give the medical professional the best possible view.

Most stand-alone systems work as consulting systems in such a way that they require direct interaction from the user to start the dialogue. Interfaced systems are connected with information systems like laboratory systems or electronic patient records and they retrieve data from that system for their input. Integrated systems are directly connected or embedded into medical devices like diagnostic ECG recorders.

Requirements and Critical Issues for Clinical Decision Systems Development

Clinical decision systems can be successful if they are planned to support human professionals in tasks where support is needed and where support can be offered with information technology. For instance, information enhancement is needed in many decision making situations as a precursor for the next decision step, data reduction is necessary to interpret or identify significant findings, and data validation is needed to have reliable data.

Representational issues, both semantic and syntactic, have to be solved in such a way that no misunderstandings or misinterpretations are met. We have to be conscious also of the system’s brittleness, because the model is an abstraction and approximation of the reality, and transferring a model from one environment to another may result in an invalid model and implementation.

In developing models and specifying the system, users need to be involved. On the other hand, they need tools which enable them to analyze, structure, and visualize domain knowledge without strong support by technology personnel, i.e., tools that support modeling and reasoning strategies and system architectures for different tasks, domains, and environments [Brender et al., 1993].

Clinical decision systems must be acceptable to the users, serve real users’ needs, be cost-effective and their effects and impacts must be known. In the early expert systems, explanation facilities were important for acceptability by users. These explanation facilities served both educational purposes and through them the system’s output was made clear for the user. Critiquing was more acceptable to the users than the consulting mode [Miller 1988], because the critiquing system does not give advice to the user, but takes the user’s proposal and discusses its pros and cons in relation to other possibilities.

Medical textbooks, journals, and other reference materials are the stores of experiential and scientific knowledge. Accessing the right source at the right time is the question. Care guidelines and protocols are an approach to compiling medical knowledge into an operational form leading to protocol-based care. The acts needed in handling the problem of a patient can be seen as an instantiation of a template containing that “care package”. Clinical protocols are used routinely in clinical research to determine the efficacy and effectiveness of new drugs and medical procedures. They are also commonly used in cancer therapy and in other medical specialities.

However, while medical knowledge is universal, clinical practice is local. Therefore guidelines and protocols need to be locally adapted. Furthermore, as each patient is an individual, protocols cannot be applied in every detail. Instead the user must have the freedom to apply a template according to the needs of that particular case. Consequently what started as protocols and decision trees in the knowledge-based system era of the 1980s has gradually changed so that today they are called clinical guidelines and clinical pathways [Prestige]. As a result, decision support is gradually becoming embedded into clinical information systems, in contrast to the stand-alone decision systems of the past.

Evaluation of Clinical Decision Systems

Evaluation and validation are necessary tasks during development and before accepting the systems into use [Clarke et al., 1994]. They are means to control the development and guarantee the quality of the results; thus they support development of good clinical practice. The users need guidance and support to perform evaluations, and the developer or the vendor of the system must provide the user with the material needed for evaluation [Nykänen, 1990]. The evaluations of clinical decision systems performed and reported in the literature show that a global, generally accepted methodology for evaluation does not exist. The reported results are subjective and evaluations have been performed without generally accepted objectives and standards.

In a study [Wyatt, 1987] 14 medical decision support systems were evaluated according to a three-stage procedure where in stage one, the main consideration was the system’s clinical role, stage two considered if the system had been tested with a proper sample with predefined goals for accuracy and utility, and stage three considered if the performance of the doctor was tested when he was aided or not aided by the system. The result was that only three systems had their clinical role defined, seven systems had paper tests, and only one system had been subjected to a field test. Medical experts were used as the “Gold Standard,” and in most cases only one expert had been used. Of these 14 systems, only two have ever been used in practice. Another study [Lundsgaarde, 1987] concluded that only about 10% of the medical decision support systems had reached laboratory testing. A further study of 64 medical decision support systems [Pothoff et al., 1988] concluded that most systems were still in their developmental stage and only eight systems had been developed far enough to enter routine clinical use.

Most of the decision support systems are planned to work as supporting devices for the clinicians and thus they have to be integrated into the user’s environment. Evaluation of their fitness for that has been neglected in reported evaluations with some exceptions. General practitioners used the SPHINX system [Botti et al., 1989] for six months. Though this system was used by the practitioners via the French MINITEL-network, the legal issues were not considered in the studies. These issues should, however, be of particular importance for a system operating on a network [Clarke et al., 1994].

In a study by Nykänen et al. [1998], some health telematics projects were studied from the validation methodology perspective. Results show that the importance of evaluation and validation has been understood only very recently. Too few resources are usually reserved for evaluation, and usable evaluation methodologies, as well as the experience and skills needed to apply them, are still lacking. Validation and evaluation are not yet fully integrated with the development process so that evaluation results could be used as feedback for the systems’ development.

In summary, organizational and human aspects in a health care environment require more attention. In the user environment, evaluation should at least consider if the system helps the user to improve the outcome of his work in the organizational setting and if it is usable and cost-effective [Friedman and Wyatt 1997, Brender 1997].

It is difficult to foresee all the legal implications of decision support systems on clinical practice. Health professionals may even think that using computer applications is less harmful than not using them [Hafner et al. 1989]. Scientists have tried to stimulate a discussion of the legal responsibilities of clinicians who have not given state-of-the-art treatment because they were not able to find relevant information. This is becoming one of the leading arguments in favor of decision support systems in health care in the future [Forsström and Rigby 1998]. A general interpretation of the liability principle has also been that decisions suggested or supported by a computer system are ultimately the responsibility of the doctor who puts them into effect.

A project in the Telematics Applications Programme of the European Union (VATAM, Validation of Telematic Applications in Medicine) develops guidelines for validation. These guidelines will include a repository of existing methodologies, a glossary of terms, and a library of tools and checklists [Talmon et al. 1998]. The purpose of the guidelines is to report on experiences, to inform on possible approaches, and to assist in selecting an experimental study design, methodology, methods, and metrics for evaluation. The guidelines are targeted for the various stakeholders, the different types of health telematic systems and they cover the system life cycle. Information and descriptions of various evaluation approaches and methodologies can also be found in [Friedman and Wyatt 1997, Brender 1997].


First, clinical decision systems applied statistical methods and these systems were developed for restricted data reduction problems in data-rich domains. Most problems at this phase were related to acquisition of the required valid data. Next, a generation of clinical support systems was developed using artificial intelligence approaches and methods. These systems aimed at emulating human intelligent behavior. However, modeling and representation of human intelligence and medical reasoning proved to be more difficult than expected, and the resulting systems were restricted in performance and brittle with respect to their knowledge and domain. The biggest problems with artificial intelligence-based systems were related to knowledge acquisition because medical expertise could not be elicited through interviews as experts cannot explicate their tacit knowledge. Current clinical decision systems have been integrated with traditional information technologies.

Integration of decision support systems with information technologies and with the developing health care information infrastructure offers real challenges for clinical decision systems in the future. Systems can be integrated, interfaced or embedded with other information technology systems and products. Data acquisition can be performed via networks, and support can be offered via networks in the most suitable modality, where and when needed. The future users of decision support systems include not only health care professionals and physicians, but also patients and citizens at home and in need of support.


Bennet J S. 1985. ROGET: A knowledge-based system for acquiring the conceptual structure of a diagnostic expert system. J. Automatic Reasoning, 1:49-74.

Botti G, Joubert M, Fieschi D, Proudon H, and Fieschi M. 1989. Experimental use of the medical expert system SPHINX by general practitioners: Results and analysis. In: Barber B, Cao D, Qin D and Wagner G (Eds.), Proc. Sixth Conf. Med. Inform. (MEDINFO 1989), p. 67-71, North-Holland, Amsterdam.

Brender J, Talmon J, Nykänen P, O’Moore R, Drosos P and McNair P. 1993. KAVAS-2: Knowledge Acquisition, Visualisation and Assessment System. In: European Conference on Artificial Intelligence in Medicine, Andreassen S, Engelbrecht R and Wyatt J, (Eds.) Munich, 1993, pp. 417-420. IOS Press, Amsterdam.

Brender J. 1997. Methodology for assessment of medical IT-based systems in an organisational context. Technology and Informatics 42, IOS Press, Amsterdam.

Buchanan B G and Shortliffe E H (Eds.). 1984. Rule-based expert systems: The MYCIN experiments of the Stanford Heuristic Programming Project. Addison-Wesley, Reading, MA.

Chandrasekaran B and Mittal S. 1982. Deep versus compiled knowledge approaches to diagnostic problem solving. In Proc. Amer. Assoc. Artif Intell. ’82, pp. 349-354.

Chandrasekaran B. 1986. Generic tasks in knowledge-based reasoning: High-level building blocks for expert systems. IEEE Expert, Fall: 23-30.

Clancey W J. 1985. Heuristic classification. Art. Intell. 27: 289-350.

Clarke K, O’Moore R, Smeets R, Talmon J, Brender J, McNair P, Nykänen P, Grimson J and Barber B. 1994. A methodology for evaluation of knowledge based systems in medicine. Int. J. Art. Intell. Med., 6(2):107-122.

Cochrane: The Cochrane Collaboration. http://hiru.mcmaster.ca/cochrane/default.htm

Friedman C P and Wyatt J. 1997. Evaluation methods in medical informatics. Springer-Verlag, New York.

Forsström J F and Rigby M. 1998. Addressing the Quality of the IT Tool: Assessing the Quality of Medical Software and Information Services. Int. J. Med. Inform. (accepted for publication).

Galen: Generalised Architecture for Languages, Encyclopaedias and Nomenclatures in medicine, a European Union supported project in the Telematics Applications Program, sector Health. http://www.cs.man.ac.uk/mig/giu

Gremy F. 1988. The role of informatics in medical research. In: Data, information and knowledge in medicine, developments in medical informatics in historical perspective. Meth. Inf. Med. Special Issue, pp. 305-309.

Hafner AW, Filipowicz AB, Whitely WP. 1989. Computers in medicine: Liability issues for physicians. Int. J. Clin. Monit. Comput, 6:185-94.

Lundsgaarde H P. 1987. Evaluating medical expert systems. Social Science and Medicine, 24 (10): 805-819.

Miller R A, Pople H E and Myers J D. 1982. Internist-1: An experimental computer-based diagnostic consultant for general internal medicine. N. Eng. J. Med. 307: 468-476.

Miller P L (Ed.). 1988. Selected Topics in Medical Artificial Intelligence. Springer-Verlag, New York.

Musen M A. 1989. Generation of model-based knowledge acquisition tools for clinical trial advice systems. PhD Dissertation, Stanford University, USA.

Newell A. 1982. The knowledge level. Art. Intell., 18:87-127.

Nykänen P (Ed.). 1990. Issues in evaluation of computer-based support to clinical decision making. Report of the SYDPOL-5 Working Group, Oslo University Press, Research Report 127, Oslo.

Nykänen P, Enning J, Talmon J, Hoyer D, Sanz F, Thayer C, Roine R, Vissers M and Eurling F. 1998. Inventory of validation approaches in selected health telematics projects. Int. J. Med. Inform., (accepted for publication).

Pedersen S A, Jensen P F and Nykänen P. 1993. Epistemological analysis of deep medical knowledge in selected medical domain. Public Report of the KAVAS-2 (A2019) Project. Commission of the European Communities, Telematic systems in health care.

Pothoff P, Rothemund M, Schwebel D, Engelbrecht R and van Eimeren W. 1988. Expert systems in medicine. Possible future effects. Int. J. Technology Assessment in Health Care, 4:121-133.

Prestige: Patient record supporting telematics and guidelines, a European Union supported project in the Telematics Applications Program, sector Health. http://www.rbh.thames.nhs.uk/rbh/itdept/r&d/projects/prestige.htm

Shortliffe E H. 1976. Computer-based consultations: MYCIN. Elsevier, New York.

Shortliffe E H. 1988. Editorial: Medical knowledge and medical decision making. In: Data, information and knowledge in medicine, developments in medical informatics in historical perspective. Meth. Inf. Med. Special Issue 1988, 209-218.

Shortliffe E H, Perreault L E, Wiederhold G and Fagan L (Ed.). 1990. Medical Informatics: Computer applications in health care. Addison-Wesley, Reading, MA.

Steels L. 1990. Components of expertise. AI Magazine, Summer 12: 28-49.

Stefanelli M. 1993. European research efforts in medical knowledge-based systems. Int. J. Art. Intell. Med. 5: 107-124.

Talmon J, Enning J, Castadena G, Eurlings F, Hoyer D, Nykänen P, Sanz F, Thayer C, Vissers M. 1998. The VATAM guidelines. Int. J. Med. Inform., (accepted for publication). Guidelines accessible via web-server: http://www-vatam.unimaas.nl

UMLS: Unified Medical Language System, http://nlm.nih.gov/research/umls/UMLSDOC.html

Van Bemmel J H and Musen M (Eds.). 1997. Handbook of Medical Informatics. Bohn Stafleu Van Loghum, Houten, the Netherlands.

Wyatt J. 1987. The evaluation of clinical decision support systems: A discussion of the methodology used in the ACORN project. In: Proc. Art. Intell. Med., Fox J, Fieschi M and Engelbrecht R, (Eds.) Springer-Verlag, Berlin Heidelberg, pp. 15-24.

Summers, R., Carson, E. R., Cramp, D. “Expert Systems: Methods and Tools.” The Biomedical Engineering Handbook: Second Edition.

Ed. Joseph D. Bronzino

Boca Raton: CRC Press LLC, 2000

Artificial Neural Networks: Definitions, Methods, Applications



Daniel A. Zahner

Rutgers University

Evangelia Micheli


Rutgers University

Definitions
Training Algorithms
Backpropagation Algorithm • The ALOPEX Algorithm
VLSI Applications of Neural Networks
Applications in Biomedical Engineering
Expert Systems and Neural Networks • Applications in Mammography • Chromosome and Genetic Sequences Classification

The potential of achieving a great deal of processing power by wiring together a large number of very simple and somewhat primitive devices has captured the imagination of scientists and engineers for many years. In recent years, the possibility of implementing such systems by means of electro-optical devices and in very large scale integrations has resulted in increased research activities.

Artificial neural networks (ANNs) or simply neural networks (NNs) are made of interconnected devices called neurons (also called neurodes, nodes, neural units, or simply units). Loosely inspired by the makeup of the nervous system, these interconnected devices look at patterns of data and learn to classify them. NNs have been used in a wide variety of signal-processing and pattern-recognition applications and have been applied successfully in such diverse fields as speech processing [1-4], handwritten character recognition [5-7], time-series prediction [8-9], data compression [10], feature extraction [11], and pattern recognition in general [12]. Their attractiveness lies in the relative simplicity with which the networks can be designed for a specific problem along with their ability to perform nonlinear data processing.

As the neuron is the building block of a brain, a neural unit is the building block of a neural network. Although the two are far from being the same, or from performing the same functions, they still possess important similarities. NNs consist of a large number of interconnected units that give them the ability to process information in a highly parallel way. The brain as well is a massively parallel machine, as has long been recognized. As each of the $10^{11}$ neurons of the human brain integrates incoming information from all other neurons directly or indirectly connected to it, an artificial neuron sums all inputs to it and creates an output that carries information to other neurons. The connection from one neuron’s dendrites or cell body to another neuron’s processes is called a synapse. The strength with which two neurons influence each other is called a synaptic weight. In an NN, all neurons are connected to all other neurons by synaptic weights that can have seemingly arbitrary values, but in reality, these weights show the effect of a stimulus on the neural network and the ability or lack of it to recognize that stimulus.

In the biological brain, two types of processes exist: static and dynamic. Static brain conditions are those which do not involve any memory processing, while dynamic processes involve memory processing and changes through time. Similarly, NNs can be divided into static and dynamic: the former involve no memory and depend only on current inputs, while the latter have memory and can be described by differential equations that express changes in the dynamics of the system through time.

All NNs have certain architectures, and all consist of several layers of neuronal arrangements. The most widely used architecture is that of the perceptron first described in 1958 by Rosenblatt [13]. In the sections that follow we will build on this architecture, but not necessarily on the original assumptions of Rosenblatt.

Since there are many names in the literature that express the same thing and usually create a lot of confusion for the reader, we will define the terms to be used and use them throughout the chapter. Terminology is a big concern for those involved in the field and for organizations such as IEEE. A standards committee has been formed to address issues such as nomenclature and paradigms. In this chapter, whenever possible, we will try to conform to the terms and definitions already in existence.

Some methods for training and testing of NNs will be described in detail, although many others will be left out due to lack of space, but references will be provided for the interested reader. A small number of applications will be given as examples, since many more are discussed in another chapter of this Handbook and it would be redundant to repeat them here.


Definitions

Neural nets (NNs) go by many other names, such as connectionist models, neuromorphic systems, and parallel distributed systems, as well as artificial NNs to distinguish them from the biologic ones. They contain many densely interconnected elements called neurons or nodes, which are nothing more than computational elements nonlinear in nature. A single node acts like an integrator of its weighted inputs. Once the result is found, it is passed to other nodes via connections that are called synapses. Each node is characterized by a parameter that is called the threshold or offset and by the kind of nonlinearity through which the sum of all the inputs is passed. Typical nonlinearities are the hard limiter, the ramp (threshold logic element), and the widely used sigmoid.

The simplest NN is the single-layer perceptron [13, 14], a simple net that can decide whether an input belongs to one of two possible classes. Figure 182.1 is a schematic representation of a perceptron, the output of which is passed through a nonlinearity called an activation function. The activation function can be of different types, the most popular being a sigmoidal logistic function.

FIGURE 182.2 Typical activation functions.

Figure 182.2 is a schematic representation of some activation functions, such as the hard limiter (or step), the threshold logic (or ramp), a linear function, and a sigmoid. The neuron of Fig. 182.1 receives many inputs $I_i$, each weighted by a weight $W_i$ ($i = 1, 2, \ldots, N$). These inputs are then summed. The sum is then passed through the activation function $f$, and an output $y$ is calculated only if a certain threshold is exceeded. Complex artificial neurons may include temporal dependencies and more complex mathematical operations than summation [15]. While each node has a simple function, their combined behavior becomes remarkably complex when organized in a highly parallel manner.
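The single neuron described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`hard_limiter`, `ramp`, `sigmoid`, `neuron_output`), the example weights, and the choice of subtracting the threshold from the weighted sum are not taken from the chapter.

```python
import math

def hard_limiter(x):
    """Step nonlinearity: 1 for non-negative input, 0 otherwise."""
    return 1.0 if x >= 0.0 else 0.0

def ramp(x):
    """Threshold logic element: linear between 0 and 1, clipped outside."""
    return min(1.0, max(0.0, x))

def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, threshold, activation):
    """Sum the weighted inputs, subtract the threshold (offset), apply f."""
    net = sum(w * i for w, i in zip(weights, inputs)) - threshold
    return activation(net)

# Hypothetical example: three inputs with hand-picked weights.
inputs = [1.0, 0.0, 1.0]
weights = [0.5, -0.3, 0.25]
for f in (hard_limiter, ramp, sigmoid):
    print(f.__name__, neuron_output(inputs, weights, threshold=0.5, activation=f))
```

The same weighted sum gives a binary decision with the hard limiter and a graded output with the ramp or sigmoid, which is the distinction Fig. 182.2 draws.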

NNs are specified by their processing element characteristics, the network topology, and the training or learning rules they follow in order to adapt the weights W. Network topology falls into two broad classes: feedforward (nonrecursive) and feedback (recursive) [16]. Nonrecursive NNs offer the advantage of simplicity of implementation and analysis. For static mappings, a nonrecursive network is all one needs to specify any static condition. Adding feedback expands the network’s range of behavior, since now its output depends on both the current input and network states. But one has to pay a price: longer times for teaching the NN to recognize its inputs.

Obviously, the NN of Fig. 182.1 is quite simple and inadequate for solving real problems. A multilayer perceptron (MLP) is the next choice. A number of inputs are now connected to a number of nodes at a second layer called the hidden layer. The outputs of the second layer may connect to a third layer and so on, until they connect to the output layer. In this representation, every input is connected to every node in the next layer, and the outputs of one hidden layer are connected to the nodes of the next hidden layer, and so on. Figure 182.3 depicts a three-layer feedforward network, the simplest MLP.
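A forward pass through such a layered network can be sketched as follows. The sketch assumes sigmoid units throughout and an explicit bias per node; the names `layer_forward` and `mlp_forward` and all numerical values are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """weights[j][i] connects input i to node j of this layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_forward(inputs, layers):
    """Propagate the inputs through a list of (weights, biases) layers."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical three-layer network: 2 inputs, 3 hidden nodes, 1 output node.
layers = [
    ([[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[0.6, -0.2, 0.4]], [0.05]),                                # output layer
]
print(mlp_forward([1.0, -1.0], layers))
```

Every node of one layer feeds every node of the next, so adding a further hidden layer only means appending another `(weights, biases)` pair to the list.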

The optimal number of hidden neurons needed to perform an arbitrary mapping is yet to be determined. In practice, the number is chosen mainly by intuition or by trial and error. Recent work has attempted to formulate bounds for the number of hidden nodes needed [17]. Mathematical derivation proves that a bound exists on the number of hidden nodes $m$ needed to map a $k$-element input set. The formulation is that $m = k - 1$ is an upper bound. These results are consistent with the optimal number of hidden neurons determined empirically in Kung et al. [18]. In other works, the number of hidden nodes necessary has been proposed to be a function of the number of separable regions needed as well as the dimension of the input vector [19].

Artificial NNs usually operate in one of two modes. Initially, there exists a training phase where the interconnection strengths are adjusted until the network has a desired output. Only after training does the network become operational, i.e., capable of performing the task it was designed and trained to do. The training phase can be either supervised or unsupervised. In supervised learning, there exists information about the correct or desired output for each input training pattern presented [20]. The original perceptron and backpropagation are examples of supervised learning. In this type of learning, the NN is trained on a training set consisting of vector pairs. One of these vectors is used as input to the network; the other is used as the desired or target output. During training, the weights of the NN are adjusted in such a way as to minimize the error between the target and the computed output of the network. This process might take a large number of iterations to converge, especially because some training algorithms (such as backpropagation) might converge to local minima instead of the global one. If the training process is successful, the network is capable of performing the desired mapping.

In unsupervised learning, no a priori information exists, and training is based only on the properties of the patterns. Sometimes it is also called self-organization [20]. Training depends on statistical regularities that the network extracts from the training set and represents as weight values. Applications of unsupervised learning have been limited. However, hybrid systems of unsupervised learning combined with other techniques produce useful results [21-23]. Unsupervised learning is highly dependent on the training data, and information about the proper classification is often lacking [21]. For this reason, most NN training is supervised.

Training Algorithms

After McCulloch and Pitts [24] demonstrated, in 1943, the computational power of neuron-like networks, much effort was given to developing networks that could learn. In 1949, Donald Hebb proposed the strengthening of connections between presynaptic and postsynaptic units when both were active simultaneously [25]. This idea of modifying the connection weights, as a method of learning, is present in most learning models used today. The next major advancement in neural networks was by Frank Rosenblatt [13, 14]. In 1960, Widrow and Hoff proposed a model, called the adaptive linear element (ADALINE), which learns by modifying variable connection strengths, minimizing the square of the error in successive iterations [26]. This error-correction scheme is now known as the least mean square (LMS) algorithm, and it has found widespread use in digital signal processing.

There was great interest in NN computation until Minsky and Papert published a book in 1969 criticizing the perceptron. This book contained a mathematical analysis of perceptron-like networks, pointing out many of their limitations. It was shown that the single-layer perceptron was incapable of performing the XOR mapping. The single-layer perceptron was severely limited in its capabilities. For linear activation functions, multilayer networks were no different from single-layer models. Minsky and Papert pointed out that multilayer networks with nonlinear activation functions could perform complex mappings. However, the lack of any training algorithms for multiple-layer networks made their use impossible. It was not until the discovery of multilayer learning algorithms that interest in NNs resurfaced. The most widely used training algorithm is the back-propagation algorithm, as already mentioned.
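The remark that multilayer networks with linear activation functions are no different from single-layer models can be checked in one line. As a sketch, assume a two-layer network with weight matrices $W_1$ and $W_2$ (symbols introduced here only for illustration) and the identity as activation function:

```latex
y = W_2 (W_1 x) = (W_2 W_1)\, x = W x, \qquad W = W_2 W_1 .
```

The composite map is again a single linear layer with weight matrix $W$, so stacking linear layers adds no representational power. A nonlinearity between the layers is what breaks this equivalence and makes mappings such as XOR reachable.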

Another algorithm used for multilayer perceptron training is the ALOPEX algorithm. ALOPEX was originally used for visual receptive field mapping by Tzanakou and Harth in 1973 [28-30] and has since been applied to a wide variety of optimization problems. These two algorithms are described in detail in the next subsections.

Backpropagation Algorithm

The backpropagation algorithm is a learning scheme where the error is backpropagated layer by layer and used to update the weights. The algorithm is a gradient-descent method that minimizes the error between the desired outputs and the actual outputs calculated by the MLP. Let

$$E_p = \frac{1}{2} \sum_{i=1}^{N} (T_i - Y_i)^2 \qquad (182.1)$$

be the error associated with template $p$. $N$ is the number of output neurons in the MLP, $T_i$ is the target or desired output for neuron $i$, and $Y_i$ is the output of neuron $i$ calculated by the MLP. Let $E = \sum_p E_p$ be the total measure of error. The gradient-descent method updates an arbitrary weight $w$ in the network by the following rule:

$$w(n + 1) = w(n) + \Delta w(n) \qquad (182.2)$$

where

$$\Delta w(n) = -\eta \frac{\partial E}{\partial w(n)} \qquad (182.3)$$

Here $n$ denotes the iteration number, and $\eta$ is a scaling constant. Thus the gradient-descent method requires the calculation of the derivatives $\partial E / \partial w(n)$ for each weight $w$ in the network. For an arbitrary hidden-layer neuron, its output $H_j$ is a nonlinear function $f$ of the weighted sum of all its inputs ($net_j$):

$$H_j = f(net_j) \qquad (182.4)$$

where $f$ is the activation function. The most commonly used activation function is the sigmoid function given by

$$f(x) = \frac{1}{1 + e^{-x}} \qquad (182.5)$$

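Using the chain rule, we can write

$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial net_j} \frac{\partial net_j}{\partial w_{ij}} \qquad (182.6)$$

and since

$$net_j = \sum_{i=1}^{n} w_{ij} I_i \qquad (182.7)$$

we have

$$\frac{\partial net_j}{\partial w_{ij}} = I_i \qquad (182.8)$$

For a hidden neuron $j$ whose output $H_j$ feeds the $m$ neurons $k$ of the next layer, the chain rule also gives

$$\frac{\partial E}{\partial net_j} = \sum_{k=1}^{m} \frac{\partial E}{\partial net_k} \frac{\partial net_k}{\partial H_j} \frac{\partial H_j}{\partial net_j}$$

with

$$\frac{\partial net_k}{\partial H_j} = w_{jk}, \qquad \frac{\partial H_j}{\partial net_j} = f'(net_j)$$

In summary, then, first, the output $Y_j$ for all the neurons in the network is calculated. The error derivative needed for the gradient-descent update rule of Eq. (182.2) is calculated from

$$\frac{\partial E}{\partial w} = \frac{\partial E}{\partial net} \frac{\partial net}{\partial w}$$

Assuming $f$ to be the sigmoid function of Eq. (182.5), then $f'(net_j) = H_j (1 - H_j)$, and

$$\frac{\partial E}{\partial net_j} = H_j (1 - H_j) \sum_{k=1}^{m} w_{jk} \frac{\partial E}{\partial net_k} \qquad (182.14)$$

Equation (182.14) gives the unique relation that allows the backpropagation of the error to all hidden layers. For the output layer,

$$\frac{\partial E}{\partial net_j} = \frac{\partial E}{\partial Y_j} \frac{\partial Y_j}{\partial net_j} \qquad (182.15)$$

Recalling that, for a single template,

$$E = \frac{1}{2} \sum_{i=1}^{N} (T_i - Y_i)^2$$

it follows that

$$\frac{\partial E}{\partial Y_j} = -(T_j - Y_j)$$

so that, for a sigmoid output neuron, $\partial E / \partial net_j = -(T_j - Y_j)\, Y_j (1 - Y_j)$.

If $j$ is a hidden neuron, then the error derivative is backpropagated by using Eqs. (182.14) and (182.15). Substituting, we get

$$\frac{\partial E}{\partial net_j} = Y_j (1 - Y_j) \sum_{k=1}^{m} w_{jk} \frac{\partial E}{\partial net_k} \qquad (182.20)$$

where $Y_j$ is the output of hidden neuron $j$. Finally, the weights are updated as in Eq. (182.2).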
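The procedure can be sketched for a three-layer network with sigmoid units and a single template. This is an illustration under stated assumptions rather than the authors' implementation: there are no threshold terms, the function and variable names (`forward`, `backprop`, `gradient_step`, `w_ih`, `w_ho`) are hypothetical, and the weights and inputs are arbitrary.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_ih, w_ho):
    """w_ih[i][j]: input i -> hidden j; w_ho[j][k]: hidden j -> output k."""
    n_hidden, n_out = len(w_ih[0]), len(w_ho[0])
    h = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(len(x))))
         for j in range(n_hidden)]
    y = [sigmoid(sum(w_ho[j][k] * h[j] for j in range(n_hidden)))
         for k in range(n_out)]
    return h, y

def error(x, t, w_ih, w_ho):
    """Half the summed squared error for one template."""
    _, y = forward(x, w_ih, w_ho)
    return 0.5 * sum((ti - yi) ** 2 for ti, yi in zip(t, y))

def backprop(x, t, w_ih, w_ho):
    """Return dE/dw for both weight layers for one template."""
    h, y = forward(x, w_ih, w_ho)
    # Output layer: dE/dnet_k = -(T_k - Y_k) * Y_k * (1 - Y_k)
    d_out = [-(t[k] - y[k]) * y[k] * (1.0 - y[k]) for k in range(len(y))]
    # Hidden layer: dE/dnet_j = H_j * (1 - H_j) * sum_k w_jk * dE/dnet_k
    d_hid = [h[j] * (1.0 - h[j]) *
             sum(w_ho[j][k] * d_out[k] for k in range(len(y)))
             for j in range(len(h))]
    # dE/dw = dE/dnet * (signal feeding that weight)
    g_ih = [[x[i] * d_hid[j] for j in range(len(h))] for i in range(len(x))]
    g_ho = [[h[j] * d_out[k] for k in range(len(y))] for j in range(len(h))]
    return g_ih, g_ho

def gradient_step(x, t, w_ih, w_ho, eta=0.25):
    """One gradient-descent update, w(n+1) = w(n) - eta * dE/dw."""
    g_ih, g_ho = backprop(x, t, w_ih, w_ho)
    new_ih = [[w - eta * g for w, g in zip(wr, gr)] for wr, gr in zip(w_ih, g_ih)]
    new_ho = [[w - eta * g for w, g in zip(wr, gr)] for wr, gr in zip(w_ho, g_ho)]
    return new_ih, new_ho

# Hypothetical template and starting weights.
x, t = [0.5, -1.0], [1.0]
w_ih = [[0.1, -0.2], [0.3, 0.4]]
w_ho = [[0.5], [-0.6]]
for _ in range(100):
    w_ih, w_ho = gradient_step(x, t, w_ih, w_ho)
print(error(x, t, w_ih, w_ho))
```

A useful sanity check on any such implementation is to compare each analytic derivative with a finite-difference estimate of the error; the two should agree to several decimal places.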

There are many modifications to the basic algorithm that have been proposed to speed the convergence of the system. Convergence is defined as a reduction in the overall error below a minimum threshold. It is the point at which the network is said to be fully trained. One method [31] used is the inclusion of a momentum term in the update equation such that

$$w(n + 1) = w(n) - \eta \frac{\partial E}{\partial w(n)} + \alpha \Delta w(n - 1) \qquad (182.21)$$

where $\eta$ is the learning rate, taken to be 0.25, and $\alpha$ is a constant momentum term that determines the effect of past weight changes on the direction of current weight movements.
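A momentum update for a single weight might look as follows. The sketch assumes that the momentum term multiplies the previous weight change; the value `alpha=0.9` and the quadratic test error $E(w) = w^2/2$ are assumptions made for the example, while `eta=0.25` is the learning rate quoted above.

```python
def momentum_update(w, grad, prev_delta, eta=0.25, alpha=0.9):
    """Return (new weight, weight change) for one weight.

    grad is dE/dw at the current iteration; prev_delta is the weight
    change applied at the previous iteration.
    """
    delta = -eta * grad + alpha * prev_delta
    return w + delta, delta

# Hypothetical example: minimize E(w) = w**2 / 2, whose gradient is w.
w, delta = 2.0, 0.0
for _ in range(50):
    w, delta = momentum_update(w, grad=w, prev_delta=delta)
print(w)
```

With `alpha` set to zero the rule reduces to plain gradient descent; a nonzero `alpha` lets successive steps in the same direction accumulate.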

Another approach used to speed the convergence of backpropagation is the introduction of random noise [32]. It has been shown that while inaccuracies resulting from digital quantization are detrimental to the algorithm’s convergence, analog perturbations actually help improve convergence time.

One of these variations is the modification by Fahlman [33], called quickprop, that uses second-derivative information without calculating the Hessian needed in the straight backpropagation algorithm. It requires saving a copy of the previous gradient vector, as well as the previous weight change. Computation of the weight changes uses only information associated with the weight being updated:

$$\Delta w_{ij}(n) = \frac{\nabla w_{ij}(n)}{\nabla w_{ij}(n - 1) - \nabla w_{ij}(n)} \, \Delta w_{ij}(n - 1) \qquad (182.22)$$

where $\nabla w_{ij}(n)$ is the gradient vector component associated with the weight $w_{ij}$ at iteration $n$. This algorithm assumes that the error surface is parabolic, concave upward around the minimum, and that the slope change of the weight $\nabla w_{ij}(n)$ is independent of all other changes in weights. There are obviously problems with these assumptions, but Fahlman suggests a “maximum growth factor” $\mu$ in order to limit the rate of increase of the step size, namely, that if $\Delta w_{ij}(n) > \mu \Delta w_{ij}(n - 1)$, then $\Delta w_{ij}(n) = \mu \Delta w_{ij}(n - 1)$. Fahlman also applied a hyperbolic arctangent function to the output error associated with each neuron in the output layer. This function is almost linear for small errors, but it blows up for large positive or large negative errors. Quickprop is an attempt to reduce the number of iterations needed by straight backpropagation, and it succeeded in doing so by a factor of 5, but this factor is problem-dependent. This method also requires several trials before the parameters are set to acceptable values.
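The quickprop weight change with a maximum growth factor can be sketched for one weight. The function name `quickprop_step`, the default `mu=1.75`, the use of magnitudes when applying the growth limit, and the fallback for a vanishing denominator are assumptions of this sketch, not details given in the text.

```python
import math

def quickprop_step(grad, prev_grad, prev_delta, mu=1.75):
    """Weight change for one weight from the parabolic (secant) estimate.

    grad, prev_grad: dE/dw at iterations n and n-1.
    prev_delta: weight change applied at iteration n-1.
    """
    limit = mu * abs(prev_delta)
    denom = prev_grad - grad
    if denom == 0.0:
        # Slope did not change: the parabola is degenerate, so take the
        # largest allowed step in the previous direction (assumption).
        return math.copysign(limit, prev_delta)
    delta = grad / denom * prev_delta
    # Maximum growth factor: never exceed mu times the previous step size.
    if abs(delta) > limit:
        delta = math.copysign(limit, delta)
    return delta

# Hypothetical example on E(w) = (w - 1)**2 / 2, whose gradient is w - 1.
# The weight moved from 3.0 to 2.5, so the slopes are 2.0 and 1.5.
print(quickprop_step(grad=1.5, prev_grad=2.0, prev_delta=-0.5, mu=10.0))
print(quickprop_step(grad=1.5, prev_grad=2.0, prev_delta=-0.5))
```

On an exactly parabolic error the unclipped step lands on the minimum in one move (here from 2.5 to 1.0); with the default growth factor the same step is shortened, which is the safeguard the text describes.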

Backpropagation has achieved widespread use as a training algorithm for NNs. Its ability to train multilayer networks has led to a resurgence of interest in the field. Backpropagation has been used successfully in applications such as adaptive control of dynamic systems and in many general NN applications. Dynamic systems require keeping track of time in ways that take the past into account. In fact, the biological brain performs in an admirable way just because it has access to and uses values of different variables from previous instances. Backpropagation through time is another extension of the original algorithm, proposed by Werbos in 1990 [34], and has been applied previously in the “truck backer-upper” by Nguyen and Widrow [35]. In this problem, a sequence of decisions must be made without an immediate indication of how effective these steps are. No indication of performance exists until the truck hits the wall. Backpropagation through time solves the problem, but it has its own inadequacies and performance difficulties. Despite its tremendous effect on NNs, the algorithm is not without its problems. Some of the problems have been discussed above. In addition, the complexity of the algorithm makes hardware implementations of it very difficult.

The ALOPEX Algorithm

The ALOPEX process is an optimization procedure that has been demonstrated successfully in a wide variety of applications. Originally developed for receptive field mapping in the visual pathway of frogs, ALOPEX’s usefulness and flexible form have increased the scope of its applications to a wide range of optimization problems. Since its development by Tzanakou and Harth in 1973 [28], ALOPEX has been applied to real-time noise reduction [36], pattern recognition [37], adaptive control systems [38], and multilayer NN training, to name a few.

TABLE 182.1

$\Delta X_i$            $\Delta R$            $\Delta X_i \Delta R$
$X_i \uparrow$ (+)      $R \uparrow$ (+)      +
$X_i \uparrow$ (+)      $R \downarrow$ (-)    -
$X_i \downarrow$ (-)    $R \uparrow$ (+)      -
$X_i \downarrow$ (-)    $R \downarrow$ (-)    +

Optimization procedures, in general, attempt to maximize or minimize a function $F()$. The function $F()$ is called the cost function, and its value depends on many parameters or variables.

When the number of parameters is large, finding the set ($x_1, x_2, \ldots, x_N$) that corresponds to the optimal (maximal or minimal) solution is exceedingly difficult. If $N$ were small, then one could perform an exhaustive search of the entire parameter space in order to find the “best” solution. As $N$ increases, intelligent algorithms are needed to quickly locate the solution. Only an exhaustive search can guarantee that a global optimum will be found; however, near-optimal solutions are acceptable because of the tremendous speed improvement over exhaustive search methods. ALOPEX iteratively updates all parameters simultaneously based on the cross-correlation of local changes $\Delta X_i$ and on the global response change $\Delta R$, plus an additive noise. The cross-correlation term $\Delta X_i \Delta R$ helps the process move in a direction that improves the response. Table 182.1 shows how this can be used to find a global maximum of $R$.

All parameters $X_i$ are changed simultaneously at each iteration according to

$$X_i(n) = X_i(n - 1) + \gamma \Delta X_i(n) \Delta R(n) + r_i(n) \qquad (182.23)$$

The basic concept is that this cross-correlation provides a direction of movement for the next iteration. For example, take the case where $X_i \downarrow$ and $R \uparrow$. This means that the parameter $X_i$ decreased in the previous iteration, and the response increased for that iteration. The product $\Delta X_i \Delta R$ is a negative number, and thus $X_i$ would be decreased again in the next iteration. This makes perfect sense because a decrease in $X_i$ produced a higher response; if you are looking for the global maximum, then $X_i$ should be decreased again. Once $X_i$ is decreased and $R$ also decreases, then $\Delta X_i \Delta R$ is now positive, and $X_i$ increases.

These movements are only tendencies, since the process includes a random component that will act to move the weights unpredictably, avoiding local extrema of the response. The stochastic element of the algorithm helps it to avoid local extrema at the expense of a slightly longer convergence or learning period.

Backpropagation, described earlier, being a gradient-descent method, often gets stuck in local extrema of the cost function. These local stopping points often represent unsatisfactory convergence points. Techniques have been developed to avoid the problem of local extrema, with simulated annealing [39] being the most common. Simulated annealing incorporates random noise, which acts to dislodge the process from local extrema. Crucial to the convergence of the process is that the random noise be reduced as the system approaches the global optimum. If the noise is too large, the system will never converge and can be mistakenly dislodged from the global solution. ALOPEX is another process that incorporates a stochastic element to avoid local extrema in search of the global optimum of the cost function. The cost function or response is problem-dependent and is generally a function of a large number of parameters.

The general ALOPEX updating equation (Eq. 182.23) is explained as follows: $X_i(n)$ are the parameters to be updated, $n$ is the iteration number, and $R()$ is the cost function, of which the “best” solution in terms of $X_i$ is sought. Gamma ($\gamma$) is a scaling constant, and $r_i(n)$ is a random number from a Gaussian distribution whose mean and standard deviation are varied, and $\Delta X_i(n)$ and $\Delta R(n)$ are found by

$$\Delta X_i(n) = X_i(n - 1) - X_i(n - 2) \qquad (182.24)$$

$$\Delta R(n) = R(n - 1) - R(n - 2) \qquad (182.25)$$

The calculation of R() is problem-dependent and can be easily modified to fit many applications. This flexibility was demonstrated in the early studies of Harth and Tzanakou [29]. In mapping receptive fields, no a priori knowledge or assumptions were made about the calculation of the cost function; instead, a “response” was measured. By using action potentials as a measure of the response [28, 29, 40, 41], receptive fields could be determined by using the ALOPEX process to iteratively modify the stimulus pattern until it produced the largest response.

It should be stated that, due to the stochastic nature of the process, efficient convergence depends on the proper control of both the additive noise and the gain factor $\gamma$. Initially, all parameters $X_i$ are random, and the additive noise has a Gaussian distribution with mean 0 and standard deviation $\sigma$, initially large. The standard deviation $\sigma$ decreases as the process converges to ensure a stable stopping point. Conversely, $\gamma$ increases with iterations. As the process converges, $\Delta R$ becomes smaller and smaller, and an increase in $\gamma$ is needed to compensate for this.

Additional constraints include a maximal change permitted for $X_i$ for one iteration. This bounded step size prevents the algorithm from making drastic changes from one iteration to the next. These drastic changes often lead to long periods of oscillation, during which the algorithm fails to converge.
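The update rule, the decaying noise, and the bounded step size can be put together in a short sketch. Several choices here are assumptions made for illustration: the names `alopex_step` and `alopex_maximize`, the linear decay of the noise standard deviation, a constant gain `gamma` (the text lets it grow with iterations), and the quadratic response used in the example.

```python
import random

def alopex_step(x_prev, x_prev2, r_prev, r_prev2, gamma, noise, max_step):
    """One ALOPEX update of all parameters at once.

    x_prev, x_prev2: parameter lists at iterations n-1 and n-2.
    r_prev, r_prev2: the corresponding global responses.
    noise: one additive noise value per parameter.
    """
    d_r = r_prev - r_prev2
    x_new = []
    for xp, xpp, r in zip(x_prev, x_prev2, noise):
        step = gamma * (xp - xpp) * d_r + r
        step = max(-max_step, min(max_step, step))  # bounded step size
        x_new.append(xp + step)
    return x_new

def alopex_maximize(response, x0, iterations=2000, gamma=1.0, sigma=0.1,
                    max_step=0.1, seed=0):
    """Search for parameters that maximize response(x); returns the best seen."""
    rng = random.Random(seed)
    x_prev2 = list(x0)
    x_prev = [x + rng.gauss(0.0, sigma) for x in x0]
    r_prev2, r_prev = response(x_prev2), response(x_prev)
    best_x, best_r = (x_prev, r_prev) if r_prev > r_prev2 else (x_prev2, r_prev2)
    for n in range(iterations):
        s = sigma * (1.0 - n / iterations)  # noise shrinks as the search proceeds
        noise = [rng.gauss(0.0, s) for _ in x_prev]
        x_new = alopex_step(x_prev, x_prev2, r_prev, r_prev2, gamma, noise, max_step)
        r_new = response(x_new)
        x_prev2, r_prev2 = x_prev, r_prev
        x_prev, r_prev = x_new, r_new
        if r_new > best_r:
            best_x, best_r = x_new, r_new
    return best_x, best_r

# Hypothetical response with its maximum (zero) at x = (1, 1).
def demo_response(x):
    return -sum((xi - 1.0) ** 2 for xi in x)

print(alopex_maximize(demo_response, [0.0, 0.0]))
```

Only the scalar response is fed back to every parameter, so the same loop applies unchanged whether the parameters are pixels of a stimulus pattern or weights of a network.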

Two-Dimensional Template Matching with ALOPEX

In this application, ALOPEX is used in template matching. A pattern is compared with a list of templates, and for each template $j$, a response $R_j$ is calculated. ALOPEX then iteratively modifies the initial pattern according to the global response calculation, local changes in the pattern, and an additive noise element. The updating equations are as described in the preceding subsection (Eqs. 182.23 to 182.25). An explanation of the response calculation used in template matching follows. $X_i(n)$ are the parameters to be optimized, and $\theta_{ij}$ are the stored templates; $j$ is the number of the template.

The response, in this case, is given by

$$R_j(n) = \sum_{i=1}^{N} X_i(n)\, \theta_{ij} \qquad (182.26)$$

Other methods [40] of calculating $R(n)$ include making it equal to a weighted sum of all the $R_j$:

$$R(n) = \sum_{j} R_j(n)\, W_j(n) \qquad (182.29)$$

where

$$W_j(n) = \frac{R_j(n)}{\sum_{k} R_k(n)} \qquad (182.30)$$

The weighting of $R_j$ by $W_j$ and summing tends to slow the convergence of the algorithm, and hence Eq. (182.28) is used to select $R(n)$ from all $R_j(n)$.

Using the simple response calculation in Eq. (182.26) for a pixel-by-pixel multiplication between the pattern and template j, good results were obtained for binary (two-valued) images. The algorithm has certain constraints that dictate how it finds the maximum response. However, when multivalued templates were used, there was no longer proper convergence.

Using a response calculation based on pixel-by-pixel correlation is not suited for intermediate-valued optimization. In binary-valued optimization, the tendency to lump the maximum allowed luminance into a single pixel does not exist; therefore, the problem can go undetected. The basic problem is that in supervised learning the ideal image or solution should have maximal response, and for the example shown above, this was not the case.

To solve this problem, a new response calculation had to be implemented. With an understanding of the previous problem and of the goal of any optimal response calculation, a new function was used. The new response function, given by

$$R_j(n) = \sum_{i=1}^{N} \left\{ \theta_{ij} - \left[ X_i(n) - \theta_{ij} \right]^2 \right\} \qquad (182.31)$$

defines a downward-facing parabola with its apex (or maximum) occurring only when $X_i = \theta_{ij}$. This is basically a least-square-error calculation that has been inverted so as to produce a maximum instead of a minimum. Also, the maximum contribution for each pixel is exactly $\theta_{ij}$; this is important because now the maximal response of $R_j(n)$ is simply equal to the total luminance. The rules for obtaining $R(n)$ from the $R_j(n)$, including $R(n) = \max_j [R_j(n)]$, are the same as before. Using the new response calculation, much better convergence was achieved.
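The difference between the two response calculations can be seen on a small multivalued template. The sketch assumes that the new response has the per-pixel form "template value minus squared deviation", which is what the description of the inverted least-square error implies; the function names and the three-pixel template are hypothetical.

```python
def correlation_response(pattern, template):
    """Pixel-by-pixel multiplication of pattern and template, summed."""
    return sum(x * t for x, t in zip(pattern, template))

def inverted_squared_error_response(pattern, template):
    """Per pixel: template value minus squared deviation from it."""
    return sum(t - (x - t) ** 2 for x, t in zip(pattern, template))

# Hypothetical multivalued template with total luminance 1.0.
template = [0.25, 0.5, 0.25]
ideal = list(template)       # the pattern the search should end on
lumped = [0.0, 1.0, 0.0]     # all luminance lumped into the brightest pixel

for name, pattern in (("ideal", ideal), ("lumped", lumped)):
    print(name,
          correlation_response(pattern, template),
          inverted_squared_error_response(pattern, template))
```

Under the correlation response the lumped pattern scores higher than the ideal one, so maximizing it drives the search away from the template. Under the inverted squared error the ideal pattern scores highest, with a value equal to the total luminance.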

Multilayer Perceptron Network Training by ALOPEX

An MLP also can be trained for pattern recognition using ALOPEX. A response is calculated for the $j$th input pattern based on the observed and desired output:

$$R_j(n) = O_k^{des} - \left[ O_k^{obs}(n) - O_k^{des} \right]^2 \qquad (182.32)$$

where $O_k^{obs}$ and $O_k^{des}$ are vectors corresponding to $O_k$ for all $k$. The total response for iteration $n$ is the sum of all the individual template responses $R_j(n)$:

$$R(n) = \sum_{j=1}^{m} R_j(n) \qquad (182.33)$$

In Eq. (182.33), $m$ is the number of templates used as inputs. ALOPEX iteratively updates the weights using both the global response information and local weight histories, according to the following:

$$W_{ij}(n) = r_i(n) + \gamma \Delta W_{ij}(n) \Delta R(n) + W_{ij}(n - 1) \qquad (182.34)$$

$$W_{jk}(n) = r_i(n) + \gamma \Delta W_{jk}(n) \Delta R(n) + W_{jk}(n - 1) \qquad (182.35)$$

where $\gamma$ is an arbitrary scaling factor, $r_i(n)$ is an additive Gaussian noise, $\Delta W$ represents the local weight change, and $\Delta R$ represents the global response information. These values are calculated by

$$\Delta W_{ij}(n) = W_{ij}(n - 1) - W_{ij}(n - 2) \qquad (182.36)$$

$$\Delta W_{jk}(n) = W_{jk}(n - 1) - W_{jk}(n - 2) \qquad (182.37)$$

and

$$\Delta R(n) = R(n - 1) - R(n - 2) \qquad (182.38)$$

Besides its universality to a wide variety of optimization procedures, the nature of the ALOPEX algorithm makes it suitable for VLSI implementation. ALOPEX is a biologically influenced optimization procedure that uses a single-value global response feedback to guide weight movements toward their optimum. This single-value feedback, as opposed to the extensive error-propagation schemes of other NN training algorithms, makes ALOPEX suitable for fast VLSI implementation.

VLSI Applications of Neural Networks

Much debate exists as to whether digital or analog VLSI design is better suited for NN applications. In general, digital designs are easier to implement and follow a better-understood methodology. In digital designs, computational accuracy is limited only by the chosen word length. While analog VLSI circuits are less accurate, they are smaller, faster, and consume less power than digital circuits [42]. For these reasons, applications that do not require great computational accuracy are dominated by analog designs.

Learning algorithms, especially backpropagation, require high precision and accuracy in modifying the weights of the network. This has led some to believe that analog circuits are not well suited for implementing learning algorithms [43]. Analog circuits can achieve high precision, at the cost of increasing the circuit size. Analog circuits with high precision (8 bits) tend to be as large as their digital counterparts [44]. Thus high-precision analog circuits lose their size advantage over digital circuits. Analog circuits are of greater interest in applications requiring only moderate precision.

Early studies show that analog circuits can realize learning algorithms, provided that the algorithm is tolerant to hardware imperfections such as low precision and inherent noise. In a paper by Macq et al. [45], a full analog implementation of a Kohonen map [20], one type of neural network, with on-chip learning is presented. With analog circuits having been shown capable of the computational accuracy necessary for weight modification, they should continue to be the choice of NN research.

Size, speed, and power consumption are areas where analog circuits are far superior to digital circuits, and it is these areas that constrain most NN applications. To achieve greater network performance, the size of the network must be increased. The ability to implement larger, faster networks is the major motivation for hardware implementation, and analog circuits are superior in these areas. Power consumption is also of major concern as networks become larger [46]. As the number of transistors per chip increases, power consumption becomes a major limitation. Analog circuits dissipate less power than digital circuits, thus permitting larger implementations.

One of the leaders in the field of analog VLSI neural systems is Carver Mead, who has presented a methodology for implementing biologically inspired networks in analog VLSI [47]. A silicon retina modeled after the biological retina is constructed from simple analog circuits. These simple analog circuits function as building blocks and have been used in many other designs. Some of Mead’s other work includes the design of a tracker [48], a sensorimotor integration system, capable of tracking a bright spot of light. An artificial cochlea for use in auditory localization also has been designed in analog VLSI [49].

Other methodologies of VLSI design of NNs include an analog/digital hybrid approach using pulse-stream networks. Pulse-stream encoding has been used in communication systems for many years and was first reported in the context of NNs by Murray and Smith in 1987 [50]. NN pulse-stream techniques attempt to combine the size and speed efficiency of analog circuits with the accuracy of digital circuits [51].

Recently, a digital VLSI approach to implementing the ALOPEX algorithm was undertaken by Pandya et al. [52]. Results of their study indicated that ALOPEX could be implemented using a single-instruction multiple-data (SIMD) architecture. A simulation of the design was carried out, in software, and good convergence for a 4 x 4 processor array was demonstrated.

In our laboratory, an analog VLSI chip has been designed to implement the ALOPEX algorithm by Zahner et al. [53]. An analog design was chosen in order to make full use of the algorithm’s tolerance to noise. As discussed earlier, analog designs permit larger and faster implementations than do digital designs.

Applications in Biomedical Engineering

Expert Systems and Neural Networks

Computer-based diagnosis is an increasingly used method that tries to improve the quality of health care. Systems that depend on artificial intelligence (AI), such as knowledge-based systems or expert systems, as well as hybrid systems that combine these with other techniques, like NNs, are coming into play. Systems of this sort have been developed extensively in the last 10 years with the hope that medical diagnosis and therefore medical care would improve dramatically. Hatziglygeroudis et al. [54] are developing such a system with three main components: a user interface, a database management system, and an expert system for the diagnosis of bone diseases. Each rule of the knowledge representation part is an ADALINE unit that has as inputs the conditions of the rule. Each condition is assigned a significance factor corresponding to the weight of the input to the ADALINE unit, and each rule is assigned a number, called a bias factor, that corresponds to the weight of the bias input of the unit. The output is calculated as the weighted sum of the inputs filtered by a threshold function.
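A rule represented as a single thresholded unit can be sketched as follows. The details of the system in [54] are not given here, so everything specific in this example is an assumption: the function name `rule_unit`, the encoding of conditions as 1.0 (satisfied) or 0.0 (not satisfied), the significance factors, the bias factor, and the firing threshold of zero.

```python
def rule_unit(conditions, significance, bias, threshold=0.0):
    """Fire (return 1) when the biased weighted sum reaches the threshold.

    conditions: one value per condition of the rule (1.0 if satisfied).
    significance: significance factor (input weight) of each condition.
    bias: bias factor of the rule (weight of a constant input of 1).
    """
    net = bias + sum(s * c for s, c in zip(significance, conditions))
    return 1 if net >= threshold else 0

# Hypothetical rule with three conditions; the first one matters most.
significance = [0.5, 0.25, 0.25]
bias = -0.75
print(rule_unit([1.0, 1.0, 0.0], significance, bias))  # first two conditions hold
print(rule_unit([1.0, 0.0, 0.0], significance, bias))  # only the first holds
```

Because the rule is an ordinary weighted unit, its significance and bias factors can in principle be adjusted from examples with the same error-correction training used for any ADALINE.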

Hudson et al. [55] developed an NN for symbolic processing. The network has four layers. A separate decision function is used for layer three and a threshold for each node in the same layer. If the value of the decision function exceeds the corresponding threshold value, a certain symbol is produced. If the value of the decision function does not exceed the threshold, then a different symbol is produced. The symbols so generated at adjacent nodes are combined at layer four according to a well-structured grammar, which provides the rules by which these symbols are combined [56]. The addition of a symbolic processing layer enhances the NN in a number of ways. It is, for instance, possible to supplement a network that is purely diagnostic with a level that recommends further actions or to add additional connections or nodes in order to more closely simulate the nervous system.

With increasing network complexity, parameter variance increases, and the network prediction becomes less reliable. This difficulty can be overcome if some prior knowledge can be incorporated into the NN to bias it [57]. In medical applications in particular, rules can either be given by experts or can be extracted from existing solutions to the problems. In many cases the network is required to make reasonable predictions before it has gone through any sufficient training data, relying only on a priori knowledge. The better this knowledge is initially, the better is the performance and the shorter is the training [58, 59].

Applications in Mammography

Breast cancer is one of the leading causes of death among women in the United States. Mammography has proved to be an effective diagnostic procedure for early detection of breast cancer. An important sign in its detection is the presence of microcalcifications in mammograms, especially when they form clusters. Chan et al. [60] have developed a computer-aided diagnosis (CAD) scheme based on filtering and feature-extraction methods. To reduce the number of false-positive results, Zhang et al. [61] applied a shift-invariant artificial NN. They evaluated the performance of the NN by the “jack-knife” method [62] and receiver operating characteristic analysis [63, 64]. A shift-invariant NN is a feedforward NN with local, spatially invariant interconnections similar to those of the neocognitron [65] but without the lateral interconnections. Backpropagation was used to train the network on individual microcalcifications, and a cross-validation technique was used to avoid overtraining. In this technique, the data set is divided into two sets, one used for training and the other for validating the network at predetermined intervals. The training of the network is terminated just before the performance of the network on the validation set decreases. The shift-invariant NN reduced false-positive classifications by almost 55% compared with previously used NNs.
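The stopping rule in the cross-validation technique above can be sketched as follows. This is a minimal illustration, not the procedure of [61]: the callbacks `train_step` and `validation_error`, the checking interval, and the synthetic validation curve are all assumptions made for the example.

```python
def train_with_early_stopping(train_step, validation_error,
                              max_epochs=100, check_every=5):
    """Train until the validation error stops improving (illustrative sketch).

    train_step()       -- hypothetical callback that runs one training epoch
    validation_error() -- hypothetical callback returning the current error
                          on the held-out validation set
    """
    best_error = float("inf")
    best_epoch = 0
    for epoch in range(1, max_epochs + 1):
        train_step()
        if epoch % check_every != 0:
            continue  # validate only at the predetermined intervals
        error = validation_error()
        if error < best_error:
            best_error, best_epoch = error, epoch
        else:
            break  # validation performance has started to degrade
    return best_epoch, best_error


# Synthetic stand-in for a network: validation error is lowest at epoch 20.
state = {"epoch": 0}

def train_step():
    state["epoch"] += 1

def validation_error():
    return (state["epoch"] - 20) ** 2 / 100 + 0.1

best_epoch, best_error = train_with_early_stopping(train_step, validation_error)
```

A practical version would also save the network weights at each improving check, so that the weights from `best_epoch` can be restored after training stops.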

In another study, Zheng et al. [66] used a multistage NN for detection of microcalcification clusters with almost 100% success and only one false-positive result per image. The multistage NN consists of more than one NN connected in series. The first stage is called the detail network and takes as inputs the pixel values of the original image, while the second network, the feature network, takes as inputs the output from the first stage and a set of features extracted from the original image. This approach achieved higher classification sensitivity and a lower false-positive detection rate than previously reported methods.

Another approach was used by Floyd et al. [67], in which radiologists read the mammograms and produced a list of eight findings that were used as input features for an NN. The results from biopsies were taken as the truth of diagnosis. For cases classified as indeterminate by the radiologists, the NN had a performance index of 0.86, which is quite high.

Downes [68] used similar techniques to identify stellate lesions. He used texture quantification via fractal-analysis methods instead of using the raw data. In mammograms, specific textures are usually indicative of malignancy. The method used for calculating the fractal dimension of digitized images was based on the relationship between the fractal dimension and the power spectral density.

Giger et al. [69] aligned the mammograms of left and right breasts and used a subtraction technique to find initial candidate masses. Various features were then extracted and used in conjunction with NNs in order to reduce false-positive assessments resulting from bilateral subtraction. Receiver operating characteristic (ROC) analysis was applied to evaluate the output of the NN. The methods used were evaluated using pathologically confirmed cases. This scheme yielded a sensitivity of 95% at an average of 2.5 false-positive detections per image.

Chromosome and Genetic Sequences Classification

Several clinical disorders are related to chromosome abnormalities, which are difficult to identify accurately, as is the classification of individual chromosomes. Automated systems can greatly assist human experts with some of the problems involved. One way to deal with this problem is the use of NNs. Several studies have already been done toward enhancing the ability of automated computerized systems to perform chromosome identification [70]. One such study by Sweeney and Musavi [71] analyzes metaphase chromosome spreads employing probabilistic NNs (PNNs), which have been used as alternatives in various classification problems. First introduced by Specht [72, 73], PNNs combine a kernel-based estimator of probability densities with the Bayes rule for the classification decision. The class with the highest estimated value is selected. Training a PNN thus means finding appropriate kernel functions, usually taken to be gaussian densities, so the problem reduces to the selection of a scalar parameter, namely, the standard deviation of the gaussian. A way to improve the accuracy of a PNN for chromosome classification is to use the knowledge that at most two chromosomes can be assigned to each class. This knowledge can be easily incorporated into the NN. The results were similar to or better than those of the classic backpropagation-trained NN.
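The PNN decision rule described above (a gaussian kernel density estimate per class, followed by a Bayes decision) can be sketched in Python. This is an illustrative toy, assuming equal class priors and a single shared standard deviation `sigma`; the function name, the two-dimensional feature vectors, and the class labels are hypothetical and do not come from [71].

```python
import math


def pnn_classify(x, training_data, sigma):
    """Pick the class with the highest kernel density estimate at x (sketch)."""
    scores = {}
    for label, samples in training_data.items():
        # Place a gaussian kernel on every training sample of this class.
        total = 0.0
        for s in samples:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        # The average kernel response estimates the class-conditional density
        # up to a constant shared by all classes; equal priors are assumed.
        scores[label] = total / len(samples)
    return max(scores, key=scores.get), scores


# Hypothetical two-class data set with two features per sample.
training_data = {
    "A": [(1.0, 1.0), (1.2, 0.8)],
    "B": [(4.0, 4.0), (3.8, 4.2)],
}

label_near_a, _ = pnn_classify((1.1, 1.0), training_data, sigma=0.5)
label_near_b, _ = pnn_classify((3.9, 4.0), training_data, sigma=0.5)
```

The only free parameter is `sigma`, which matches the observation above that training reduces to choosing the standard deviation of the gaussian.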

A hybrid symbolic/NN machine learning algorithm was introduced by Noordewier et al. [74] for the recognition of genetic sequences. The system uses a knowledge base of hierarchically structured rules to form an artificial NN in order to improve the knowledge base. They used this system in recognizing genes in DNA sequences. The learning curve of this system was compared with that of a randomly initialized, fully connected two-layer NN. The knowledge-based NN learned much faster than the other one, but the error of the randomly initialized NN was slightly lower (5.5% versus 6.4%). Methods also have been devised to investigate what the NN has learned by automatically translating trained NNs, initialized by the knowledge-based method, into symbolic rules [75].

Medial axis transform (MAT)-based features as inputs to an NN have been used in studying human chromosome classification [76]. Applications such as prenatal analysis and genetic syndrome diagnosis make this research very important. Human chromosome classification based on NNs requires no a priori knowledge or even assumptions on the data. MAT is a widely used method for transformations of elongated objects and requires less storage and time while preserving the topologic properties of the object. MAT also allows for a transformation from a 2D image to a 1D representation of it. The features so obtained are then fed as inputs to a two-layer feedforward NN trained by backpropagation, with almost perfect results in classifying chromosomes. An optimization of an MLP also was done [77].

References


Mueller P, Lazzaro J. 1986. Real time speech recognition. In J Denker (ed), Neural Networks for Computing. New York, American Institute of Physics.

Bourlard H, Morgan N. 1990. A continuous speech recognition system embedding a Multi-Layer Perceptron into HMM. In D Touretzky (ed), Advances in Neural Information Processing Systems, pp 186-193. San Mateo, Calif, Morgan Kaufmann.

Bridle JS, Cox SJ. 1991. RecNorm: Simultaneous normalization and classification applied to speech recognition. In RP Lippmann et al (eds), Advances in Neural Information Processing Systems 3, pp 234-240.

Lee S, Lippmann RP. 1990. Practical characteristics of neural networks and conventional classifiers on artificial speech problems. In DS Touretzky (ed), Advances in Neural Information Processing Systems 2, pp 168-177. San Mateo, Calif, Morgan Kaufmann.

Fukushima K. 1983. Neocognitron: A neural network model for a mechanism of visual pattern recognition. IEEE Trans Syst Man Cybernet SMC-13(5):826.

Dasey TJ, Micheli-Tzanakou E. 1994. An unsupervised system for the classification of hand-written digits, comparison with backpropagation training. IEEE Trans Neural Networks.

LeCun Y, Boser B, Denker JS, et al. 1989. Backpropagation applied to handwritten ZIP code recognition. Neural Comp 1(4):541.

Hakim N, Kaufman JJ, Cerf G, Medows HE. 1990. A discrete time neural network model for system identification. Proc of IJCNN 90(V3):593.

Hesh D, Abdallah C, Horne B. 1991. Recursive neural networks for signal processing and control. In Proceedings of the First IEEE-SP Workshop on Neural Networks for Signal Processing, Princeton, NJ, pp 523-532.

Cottrell GW, Munro PN, Zipser D. 1989. Image compression by backpropagation: A demonstration of extensional programming. In Advances in Cognitive Science, vol. 2. Norwood, NJ, Ablex.

Oja E, Lampinen J. 1994. Unsupervised learning for feature extraction. In JM Zurada et al (eds), Computational Intelligence Imitating Life, New York, IEEE Press.

Fogelman Soulie F. 1994. Integrating neural networks for real world applications. In JM Zurada et al (eds), Computational Intelligence Imitating Life. New York, IEEE Press.

Rosenblatt F. 1958. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol Rev 65:386.

Rosenblatt F. 1962. Principles of Neurodynamics. New York, Spartan Books.

Lippmann RP. 1987. An introduction to computing with neural nets. IEEE ASSP Mag 4-22.

Moore K. 1992. Artificial neural networks: Weighing the different ways to systemize thinking. IEEE Potentials 23.

Huang S, Huang Y. 1991. Bounds on the number of hidden neurons in multilayer perceptrons. IEEE Trans Neural Networks 2(1):47.

Kung SY, Hwang J, Sun S. 1988. Efficient modeling for multilayer feedforward neural nets. Proc IEEE Conf Acoustics, Speech Signal Processing, New York, pp 2160-2163.

Mirchandani G. 1989. On hidden nodes for neural nets. IEEE Trans Circuits Syst 36(5):661.

Kohonen T. 1988. Self-Organization and Associative Memory. New York, Springer-Verlag.

Hecht-Nielsen R. 1987. Counterpropagation networks. In Proc of the IEEE First International Conf on Neural Networks, vol 2, pp 19-32.

Dasey TJ, Micheli-Tzanakou E. 1992. The unsupervised alternative to pattern recognition: I. Classification of handwritten digits. In Proc 3rd Workshop on Neural Networks, Auburn, Ala, pp 228-233.

Dasey TJ, Micheli-Tzanakou E. 1992. The unsupervised alternative to pattern recognition: II. Detection of multiple sclerosis with the visual evoked potential. In Proc 3rd Workshop on Neural Networks, Auburn, Ala, pp 234-239.

McCulloch WS, Pitts W. 1943. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5:115.

Hebb D. 1949. The Organization of Behavior. New York, Wiley.

Widrow B, Lehr MA. 1990. 30 years of adaptive neural networks: Perceptron, Madaline, and backpropagation. Proc IEEE 78(9):1415.

Minsky M, Papert S. 1969. Perceptrons: An Introduction to Computational Geometry. Cambridge, Mass, MIT Press.

Tzanakou E, Harth E. 1973. Determination of visual receptive fields by stochastic methods. Biophys J 15:(42a).

Harth E, Tzanakou E. 1974. Alopex: A stochastic method for determining visual receptive fields. Vis Res 14:1475.

Tzanakou E, Michalak R, Harth E. 1979. The ALOPEX process: Visual receptive fields by response feedback. Biol Cybernet 35:161.

Rumelhart DE, McClelland JL (eds). 1986. Parallel Distributed Processing. Cambridge, Mass, MIT Press.

Holmström L, Koistinen P. 1992. Using additive noise in backpropagation training. IEEE Trans Neural Networks 3(1):24.

Fahlman SE. 1988. Faster learning variations of backpropagation: An empirical study. In D Touretzky et al (eds), Proc of the Connectionist Models Summer School. San Mateo, Calif, Morgan Kaufmann.

Werbos PJ. 1990. Backpropagation through time: What it does and how to do it. Proc IEEE 78(10):1550.

Nguyen D, Widrow B. 1989. The truck backer-upper: An example of self-learning in neural networks. In Proc of the Int Joint Conf Neural Networks, vol II, pp 357-361. New York, IEEE Press.

Ciaccio E, Tzanakou E. 1990. The ALOPEX process: Application to real-time reduction of motion artifact. Annu Int Conf IEEE EMBS 12(3):1417.

Dasey TJ, Micheli-Tzanakou E. 1990. A pattern recognition application of the Alopex process with hexagonal arrays. In Int Joint Conf Neural Networks, vol II, pp 119-125.

Venugopal K, Pandya A, Sudhakar R. 1992. ALOPEX algorithm for adaptive control of dynamical systems. Proc IJCNN 2:875.

Kirkpatrick S, Gelatt CD, Vecchi MP. 1983. Optimization by simulated annealing. Science 220:671.

Micheli-Tzanakou E. 1984. Non-linear characteristics in the frog’s visual system. Biol Cybernet 51:53.

Micheli-Tzanakou E. 1983. Visual receptive fields and clustering. Behav Res Methods Instrum 15(6):553.

Mead C, Ismail (eds). 1991. Analog VLSI Implementation of Neural Systems. Boston, Kluwer Academic Publishers.

Ramacher and Ruckert (eds). 1991. VLSI Design of Neural Networks. Boston, Kluwer Academic Publishers.

Graf HP, Jackel LD. 1989. Analog electronic neural network circuits. IEEE Circuits Devices Mag 44.

Macq D, Verleysen M, Jespers P, Legat J. 1993. Analog implementation of a Kohonen map with on-chip learning. IEEE Trans Neural Networks 4(3):456.

Andreou A, et al. 1991. VLSI neural systems. IEEE Trans Neural Networks 2(2):205.

Mead C. 1989. Analog VLSI and Neural Systems. Reading, Mass, Addison-Wesley.

Maher M, Deweerth S, Mahowald M, Mead C. 1989. Implementing neural architectures using analog VLSI circuits and systems. 36:(5):643.

Lazzaro J, Mead C. 1989. Silicon models of auditory localization. Neural Comput 1(1):1.

Murray AF, Smith A. 1987. Asynchronous arithmetic for VLSI neural systems. Electron Lett 23(12):642.

Murray AF, Corso D, Tarassenko L. 1991. Pulse-stream VLSI neural networks mixing analog and digital techniques. IEEE Trans Neural Networks 2(2):193.

Pandya S, Shandar R, Freytag L. 1990. An SIMD architecture for the Alopex neural networks. SPIE Parallel Architectures for Image Processing 1246:275.

Zahner AD. 1994. Design and implementation of an analog VLSI, adaptive algorithm with applications in neural networks, Master’s thesis, Rutgers University.

Hatziglygeroudis I, Vassilakos PJ, Tsakalidis A. 1994. An intelligent medical system for diagnosis of bone diseases. In Proc of the Int Conf on Med Physics and Biomed Eng., Cyprus, vol 1, pp 148-152.

Hudson DL, Cohen ME, Deedwania PC. 1993. A neural network for symbolic processing. In Proc of the 15th Annual Int Conf of the IEEE/EMBS, vol 1, pp 248-249.

Hopcroft JE, Ullman JD. 1969. Formal Languages and Their Relation to Automata. Reading, Mass, Addison-Wesley.

Roscheisen M, Hofmann R, Tresp V. 1992. Neural control for running mills: Incorporating domain theories to overcome data deficiency. In Advances in Neural Information Processing Systems 4.

Towell GG, Shavlik JW, Noordewier MO. 1990. Refinement of approximately correct domain theories by knowledge-based neural networks. In Proc of the 8th National Conf Artif Intell, pp 861-866.

Tresp V, Hollatz J, Ahmad S. 1994. Network structuring and training using rule-based knowledge. In Advances in Neural Information Processing Systems 5, pp 871-878.

Chan H-P, Doi K, Vyborny CJ, et al. 1990. Improvement in radiologists’ detection of clustered microcalcifications on mammograms: The potential of computer aided diagnosis. Invest Radiol 25:1102.

Zhang W, Giger ML, Nishihara RM, Doi K. 1994. Application of a shift-invariant artificial neural network for detection of breast carcinoma in digital mammograms. In Proc of the World Congress on Neural Networks, vol 1, pp 45-52.

Fukunaga K. 1990. Introduction to Statistical Pattern Recognition, 2nd ed. New York, Academic Press.

Metz CE. 1988. Current problems in ROC analysis. In Proc Chest Imaging Conf Madison Wisc, pp 315-336.

Metz CE. 1989. Some practical issues of experimental design and data analysis in radiological ROC studies. Invest Radiol 24:234.

Fukushima K, Miyake S, Ito T. 1983. Neocognitron: A neural network model for a mechanism of visual pattern recognition. IEEE Trans Syst Man Cybernet SMC-13:826.

Zheng B, Qian W, Clarke LP. 1994. Artificial neural network for pattern recognition in mammography. In Proc of the World Congress on Neural Networks, vol. I, pp 57-62.

Floyd CE Jr, Yun AJ, Lo JY, et al. 1994. Prediction of breast cancer malignancy for difficult cases using an artificial neural network. In Proc of the World Congress on Neural Networks, vol I, pp 127-132.

Downes P. 1994. Neural network recognition of multiple mammographic lesions. In Proc of the World Congress on Neural Networks, vol I, pp 133-136.

Giger ML, Lu P, Huo Z, Zhang W. 1994. Application of artificial neural networks to the task of merging feature data in computer-aided diagnosis schemes. In Proc of the World Congress on Neural Networks, vol I, pp 43-46.

Piper J, Granum E, Rutovitz D, Ruttledge H. 1990. Automation of chromosome analysis. Sig Proc 2(3):109.

Sweeney WP Jr, Musavi MT. 1994. Application of neural networks for chromosome classification. In Proc of the 15th Annual Int Conf of the IEEE/EMBS, vol 1, pp 239-240.

Specht DF. 1988. Probabilistic neural networks for classification mapping or associative memory. In Proc of the IEEE Int Conf on Neural Networks, vol 1, pp 525-532. New York, IEEE Press.

Specht DF. 1990. Probabilistic neural networks. 3(1):109.

Noordewier MO, Towell GG, Shavlik JW. 1993. Training knowledge-based neural networks to recognize genes in DNA sequences. Adv Neural Info Proc Sys 3:530.

Towell GG, Craven M, Shavlik JW. 1991. Automated interpretation of knowledge-based neural networks. Tech rep Univ of Wisc Computer Sci Dept, Madison, Wisc.

Lerner B, Rosenberg B, Levistein M, et al. 1994. Medial axis transform based features and a neural network of human chromosome classification. In Proc of the World Congress on Neural Networks, vol 3, pp 173-178.

Lerner B, Guterman H, Dinstein J, Romem Y. 1994. Learning curves and optimization of a multilayer perceptron neural network for chromosome classification. In Proc of the World Congress on Neural Networks, vol 3, pp 248-253.

Nykanen, P., Saranummi, N. “Clinical Decision Systems.” The Biomedical Engineering Handbook: Second Edition. Ed. Joseph D. Bronzino Boca Raton: CRC Press LLC, 2000

Artificial Intelligence in Medical Decision Making: History, Evolution, and Prospects

Early Models of Medical Decision Making

Emergence of the Knowledge-Based AI Methods for Medical Consultation

The Transition to Expert Systems and the Ascendancy of Rule-Based Systems—1976-1982

Exploration of Alternative Representations for Medical AI and the Search for Performance— 1983-1987

Explanation and Early Knowledge Level Work in AI Systems • Performance of Expert Medical Systems • Formal Theories of Reasoning Under Uncertainty in Medicine

The Past Decade—Structure, Formalism, and Empiricism

Deep Medical Knowledge for Representation and Problem Solving • Formal Decision-Making Methods and Empirical Systems

Prospects for AI in Medicine—Problems and Challenges

Usefulness of AI Decision Support Systems • Opportunities: Knowledge-Bases, Systems Integration, and the Web

Casimir Kulikowski

Rutgers University

Artificial intelligence (AI) for medical decision making and decision support had its origin in the knowledge-intensive expert consultation systems that were introduced in the early 1970s. These differed considerably from the data-intensive statistical and pattern-recognition methods that had been applied to medical reasoning problems since the 1960s and which saw a resurgence in the 1980s as new types of computationally more powerful artificial neural networks (ANNs) were developed, and sophisticated models of Bayesian and other belief networks were designed to capture the nuances of clinical reasoning. Over the past decade it has become increasingly clear that a variety of approaches is needed to make computer-based medical decision systems useful and effective, preferably embedding them within systems that are based on an electronic medical record (EMR). Meanwhile, advanced software environments increasingly and routinely incorporate AI ideas and methods as they seek to facilitate the tasks of building, validating, and testing medical knowledge bases.

Early Models of Medical Decision Making

The earliest efforts to formalize medical decision making involved the application of statistical decision methods (ROC curves) in radiographic interpretation. Precursor attempts at automating the logic of diagnostic reasoning by sorting symptoms and selecting diagnoses that matched a particular combination involved multiple slide-rule and early card and computer sorting techniques.

As computers became more powerful and easily programmable, they were the natural tool for representing diagnostic and treatment decision making. A watershed article by Ledley and Lusted [1959], in Science, described the reasoning bases for medical decision making in both logical and statistical terms. The main paradigm for representing medical decisions in the 1960s was statistical, whether Bayesian, hypothesis testing, or discriminant function analysis. The growth of standardized laboratory test panels and multiphasic screening methods had the effect of popularizing statistical techniques as an objective approach to the selection of decision thresholds. Bayesian methods were predominant and allowed incorporation of subjective estimates of prior probability into the calculation of diagnostic probabilities. Likelihood-ratio (hypothesis-testing) methods did not, but instead required a determination of sensitivity/specificity trade-offs with threshold selection dependent on a choice of clinically tolerable levels for them. There also were heuristic pattern-recognition methods being developed if more flexibility was required, but all these approaches at that time suffered from a common drawback. While they might perform very well for circumscribed problems with well-defined statistics or adequate samples from which to estimate the statistics, they were rarely acceptable to practicing physicians beyond their site of origin. A reason frequently given for this was the difficulty of explaining decisions based on strictly computational theories of probability in terms of the qualitative language and arguments familiar to physicians.
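To make the two statistical approaches above concrete, the following Python sketch computes a Bayesian posterior from a prior probability and a test's sensitivity and specificity, alongside the positive likelihood ratio, which does not use the prior. The function names and the numeric values are illustrative assumptions, not figures from the literature cited here.

```python
def posterior_probability(prior, sensitivity, specificity):
    """Probability of disease given a positive finding, by Bayes' rule."""
    true_pos = sensitivity * prior                   # P(positive and disease)
    false_pos = (1.0 - specificity) * (1.0 - prior)  # P(positive and no disease)
    return true_pos / (true_pos + false_pos)


def positive_likelihood_ratio(sensitivity, specificity):
    """How much more likely a positive finding is with disease than without."""
    return sensitivity / (1.0 - specificity)


# Hypothetical screening test: 1% prior, 90% sensitivity, 95% specificity.
posterior = posterior_probability(prior=0.01, sensitivity=0.90, specificity=0.95)
ratio = positive_likelihood_ratio(sensitivity=0.90, specificity=0.95)
```

With these assumed numbers the posterior is roughly 0.15 even though the likelihood ratio is about 18, which illustrates how strongly the prior estimate shapes the Bayesian result.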

During the 1960s, the alternative to formal statistical methods was the coding of the sequence of expert decisions in a branching logic diagram or flowchart, often described as a clinical decision algorithm. This approach had the advantages of clarity and ease of explanation but was usually too rigid to capture the nuances of context without becoming very large, complex, and computationally expensive. During this period, most mainstream AI research appeared to have little to offer medical decision making, since it was concerned primarily with the computer representation and reasoning for general problem solving without considering explicit representations of uncertainty.

Emergence of the Knowledge-Based AI Methods for Medical Consultation

In the early 1970s, several groups involving both AI and biomedical researchers decided to explore different, more knowledge-intensive approaches to interpretive problem solving, including medical decision making. Several common themes emerged from the work of the four groups that initiated these efforts at Rutgers University, Stanford University, the University of Pittsburgh, and the MIT/Tufts collaborative group. All of them, at about the same time, recognized some of the shortcomings of existing methods and proposed a set of alternative representation and inferencing approaches that involved:

Representing uncertainty more flexibly and qualitatively than appeared possible with probabilities.

Representing more of the medical knowledge that motivated and justified a diagnostic, prognostic, or therapeutic decision.

Developing a descriptive component of medical knowledge to which some general problem solving or inferencing strategy could be applied.

Early prototypes of AI consultation systems that incorporated distinct approaches to medical decision making were CASNET [Kulikowski and Weiss, 1972], MYCIN [Shortliffe et al., 1974], DIALOG/INTERNIST-1 [Pople et al., 1975], the Present Illness Program (PIP) [Pauker et al., 1976], and the Digitalis Advisor Program [Gorry et al., 1978].

That a domain-specific model could be used as the basis for interpretive decision making was first demonstrated in the DENDRAL system [Buchanan et al., 1970]. DENDRAL interpreted mass spectra by using rules that were constrained by a model of the possible chemical structures that could have given rise to the observed spectral data. This was quite different from the domain-independent, pattern-recognition approaches previously applied to the problem.

The first AI approach to decision making in medicine itself evolved within the Rutgers Research Resource on Computers in Biomedicine, which was established in 1972. One of the goals of our research group was to find general methods of representing medical reasoning that would take advantage of the knowledge of specific diseases in ways similar to those used by medical specialists. In a collaboration with Dr. Aran Safir, an ophthalmologist at the Mt. Sinai School of Medicine (New York), we explored methods for describing and applying medical knowledge for computer-based diagnosis and management. With Sholom Weiss, who worked on his doctoral dissertation on this project, we found that a very natural way for characterizing the underlying mechanisms of disease was through cause-and-effect relations. A network of causal links among pathophysiologic conditions (or abnormal physiologic states) could then be used to describe the many pathways through which a disease might manifest itself. Each pathophysiologic state could be inferred independently from the patient’s condition and asserted with some degree of confidence depending on its pattern of supportive evidence. A prototype system for testing these ideas was developed in the area of glaucoma diagnosis and management. This proved to be a good problem, because glaucoma is a major cause of loss of vision and blindness with mechanisms that are sufficiently well understood to determine the course of management; yet at the same time sufficiently complex that advice on difficult cases (particularly those resistant to conventional treatment) must frequently be sought from specialist consultants.

A variety of reasoning strategies seemed to apply to different stages of the consultation. For example, while starting to gather data on a patient, the major problem for a consultant is to elucidate the patient’s problem and develop diagnostic leads by asking the right questions and acquiring the relevant data. When enough data have accumulated, a different mode of reasoning is characteristically needed. The findings must be attributed to specific causes, whether single or multiple. Finally, all these explanations must be combined into a differential diagnosis. This serves to predict what will happen if the patient remains untreated (prognosis) and suggests possible ways of managing the disease (treatment or therapeutic planning). As we gained experience in extracting the knowledge from the expert practitioners, it became clear that different management strategies relied on different types of medical knowledge, and these strategies had to be sufficiently general and independent of the particular disease being modeled. Weiss built a prototype system called CASNET (for causal associational network) based on these ideas [Weiss, 1974; Weiss et al., 1978]. We were fortunate in being able to draw on the expertise of leading glaucoma specialists, who provided the in-depth knowledge needed to describe the disease and tested the system independently in their own laboratories. After two years of improving the initial prototype, the system was given a blind test by presenting it with new, previously unseen cases during a panel discussion at the major national meeting of the Academy of Ophthalmology in 1976. CASNET made no errors in its recommendations, though it was unable to answer some questions for which the knowledge was absent in the computer model [Lichter and Anderson, 1977].

During the early 1970s, another kind of biomedical AI resource was established at Stanford as the result of the DENDRAL project collaboration that had developed between AI researchers (E. Feigenbaum and B. Buchanan) and biomedical researchers (J. Lederberg and S. Cohen). This SUMEX-AIM Resource, established in 1974, provided advanced time sharing on a PDP-10 computer to a national network of investigators of AI in medicine, with appropriate systems and AI language support for what was then a very novel computational mode, serving an initial nucleus of investigators, including those at Rutgers and the University of Pittsburgh [Freiherr, 1979].

The medical component of the SUMEX-AIM research centered around the consultation system for infectious disease treatment developed by Shortliffe [Shortliffe et al., 1974; Shortliffe, 1976]. Its representation of medical knowledge was in the form of heuristic rules with a new certainty factor formalism, which decoupled the way in which positive and negative evidence was credited toward a hypothesis, thus providing more heuristic flexibility than previous probabilistic frameworks. The system’s strategy was goal driven for gathering the data needed to reach the diagnostic conclusions necessary to choose the appropriate covering therapies. The modularity of MYCIN’s rules turned out to be a critical design choice, since it was soon found that it was easy to modify the rules independently of the reasoning strategy and its inference engine, giving rise to a very flexible and updatable knowledge base. The certainty-factor model turned out to be less long-lived, since it was soon found to implicitly involve assumptions and constraints that did not adequately represent the complex dependencies that often exist among data and hypotheses. However, MYCIN, by providing an easy means of directly and modularly encoding the rules of medical reasoning (without an underlying disease model as in CASNET), gave rise to the most powerful representation for the first generation of knowledge-based AI systems: the rule-based paradigm [Buchanan and Shortliffe, 1984].
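The decoupling of positive and negative evidence mentioned above can be illustrated with a short Python sketch. It follows the commonly described form of the certainty-factor model, in which a measure of belief and a measure of disbelief are accumulated separately and then subtracted; the function names and the evidence values are assumptions for illustration and are not drawn from this chapter or from MYCIN's actual rule base.

```python
def combine_measure(m1, m2):
    """Accumulate two measures of belief (or of disbelief), each in [0, 1]."""
    # Each new piece of evidence closes part of the remaining gap to 1.
    return m1 + m2 * (1.0 - m1)


def certainty_factor(beliefs, disbeliefs):
    """Net certainty in [-1, 1] from separately accumulated evidence."""
    mb = 0.0  # measure of belief from supporting rules
    for b in beliefs:
        mb = combine_measure(mb, b)
    md = 0.0  # measure of disbelief from opposing rules
    for d in disbeliefs:
        md = combine_measure(md, d)
    return mb - md


# Hypothetical evidence: two supporting rules and one opposing rule.
cf = certainty_factor(beliefs=[0.6, 0.5], disbeliefs=[0.3])
```

Here the two supporting rules accumulate to a belief of 0.8, the opposing rule contributes a disbelief of 0.3, and the net certainty is about 0.5. Treating each rule's contribution as independent in this way is one of the implicit assumptions that later drew criticism.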

Another group to work intensely on AI approaches to medical decision making was that of Gorry at MIT and Pauker and Schwartz at Tufts, who led the development of the Digitalis Advisor Program [Gorry et al., 1978]. This system introduced two interesting new concepts: a patient-specific model (PSM) and a mathematical model for describing the mechanism being regulated and interpreted (digitalis uptake). The PSM grouped together all the information known about a patient during a consultation session (the findings, history, test results, etc.), as well as the hypotheses about the patient (diagnostic, prognostic, and therapeutic) produced by the computer diagnostic/therapeutic model. The mathematical model was a compartmental model of digitalis uptake, which was able to relate dosages to their expected therapeutic and toxic effects.

In a related AIM project, Pauker and Kassirer joined Gorry and Schwartz in formulating the Present Illness Program (PIP) [Pauker et al., 1976], which was designed to focus in on a medical problem based on a brief description of the patient’s complaints (the present illness). Here, Minsky’s frame formalism [Minsky, 1975], which had been developed to describe a template of expected objects in computer vision recognition tasks, was used as a template for grouping typical or characteristic information about a disease. Each frame contained slots that had to be filled with descriptors of the disease, with some of these slots containing rules for reasoning about the disease. In this way, diseases were related through rules, but these rules were grouped by their respective frames. An interesting observation that came from this work was that the probabilistic scoring of hypotheses, while important in diagnostic reasoning, is often overshadowed by the use of categorical or deterministic reasoning, which has to be flexible and suitably richly represented to be useful [Szolovits and Pauker, 1978]. Subsequently, the ABEL system for modeling acid-base balance problems [Patil et al., 1981] took advantage of many of the ideas from PIP and combined them with a causal description of disease at multiple levels of abstraction and aggregation, which helped provide a more detailed description of pathophysiologic processes.

Contemporaneously, Dr. Harry Pople at the University of Pittsburgh had been investigating biomedical applications of AI, specifically abductive reasoning with causal graphs for scientific theory formation [Pople, 1973]. In a collaboration with Dr. Jack Myers, an eminent internist, Pople began a very broad and ambitious project: to capture all of Dr. Myers’ diagnostic knowledge of internal medicine into a system that could reason automatically from the facts of a case. Over the next few years, they developed a taxonomic and causal representation for characterizing diseases and their manifestations and proceeded to develop a set of alternatives for describing the various strategies of diagnostic reasoning: the preliminary assessment of findings, the attribution of causes and explanations for each of the findings, the computation of confidences and resulting ranking of diagnostic hypotheses, the development of a differential diagnosis, and the confirmation of a diagnosis once it is already strongly indicated (through specialized tests). The DIALOG system was developed to test these ideas and codify Dr. Myers’ knowledge [Pople et al., 1975]. Its later version, called INTERNIST-1 [Pople, 1977; Miller, 1984], demonstrated how the very large domain-specific knowledge base of internal medicine could be assembled and validated with complex cases of disease, both from the clinic and journal CPC reports.

Research in AI in medicine (AIM) was initially presented in a series of AIM workshops, starting at Rutgers University [Ciesielski, 1978] and subsequently rotating among the AIM community. Later workshops were organized in conjunction with the American Association for Artificial Intelligence (AAAI) as part of its spring symposia, while sessions on medical expert systems were increasingly found at most relevant scientific, engineering, medical, and informatics conferences (AAAS, IEEE, ACM, AAAI, SCAMC, MEDINFO, and AMIA).

The Transition to Expert Systems and the Ascendancy of Rule-Based Systems—1976-1982

The experience in developing, testing, and disseminating the prototypes of the first-generation medical consultation systems, combined with similar experiences of AI researchers outside medicine, led to a shift in AI research away from general problem solving to more domain-specific knowledge-intensive problems. Feigenbaum [1978] defined generalizations of the MYCIN system for dealing with any problem where advice-giving knowledge could be captured in the form of rules. He emphasized the separation of rule-based systems from the inference engine that reasoned with them and coined the term knowledge engineering to describe the art of building a knowledge-based system. This process was centered around the interviewing of domain specialists by knowledge engineers (typically computer scientists or engineers), who would attempt to understand the problem being modeled, learn about the expertise that the specialist applied in solving it, and finally formalize all this into a computer representation of the problem-solving or consultative knowledge. The early stages of the process, usually called knowledge acquisition, frequently became the major bottleneck in developing an expert system, particularly if the knowledge engineer had difficulties in understanding the field of expertise or came with preconceived notions about it that did not match the expert’s own. Another difficulty with first-generation expert system shells was that the knowledge had to be fitted to a predetermined computer representation, which might or might not match that of the domain problem. Most shells worked well with advice-giving reasoning that involved classification-type problems that could be reduced to the selection of one or several alternatives from a large set of candidate hypotheses. In this sense, their overall problem-solving paradigm did not differ much from the statistical and pattern-recognition methods that preceded them.
Their major departure was in the architectural modularity of the rule base, flexible choice of reasoning strategies, and the representation of intermediate hypotheses and reasoning constructs with which to support and assemble a final conclusion. Researchers interested in the cognitive processes of human expert diagnosticians analyzed the problem-solving behavior of experts confronted with real and simulated cases to discern their reasoning strategies [Elstein et al., 1978; Kassirer and Gorry, 1978]. These were usually found to be sufficiently complex that they did not have an effect on the design of the rapidly spreading rule-based systems of the time. Rather, they influenced the development of more sophisticated systems based on deeper medical knowledge in the following decade.

The years 1976-1978 saw the testing, critiquing, and elaboration of the prototype systems. This resulted in changes and generalizations of the initial designs, leading to the first general system building frameworks (EXPERT [Weiss and Kulikowski, 1979], EMYCIN [van Melle, 1979], AGE [Nii and Aiello, 1979]). These came about because in the course of the research it became clear that incorporating expert knowledge about advice-giving consultation could be carried out with various kinds of reasoning: either or both hypothesis- and data-driven strategies, with corresponding backward or forward chaining of rules. Furthermore, inferences incorporated into the rules could be either interpreted (as in EMYCIN) or compiled (as in EXPERT), and knowledge could be chunked or clustered by subtasks of the problem solving with the aid of various other representational devices (context trees in EMYCIN, knowledge sources in AGE, rule clusters in EXPERT, and frames in PIP).
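The distinction between data-driven (forward) and hypothesis-driven (backward) chaining can be sketched over one and the same rule base. The example below is only an illustration under simplifying assumptions: rules are plain premise sets with a single conclusion, there are no certainty factors, and the rule contents are invented placeholders, not clinical knowledge or rules from any of the systems named above.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[Set[str], str]  # (premises, conclusion)

# Hypothetical toy rule base, for illustration only.
RULES: List[Rule] = [
    ({"fever", "stiff_neck"}, "suspect_meningitis"),
    ({"suspect_meningitis", "gram_negative_stain"}, "cover_gram_negative"),
]


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Data-driven: fire every rule whose premises hold until nothing new is added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal: str, facts: Set[str], rules: List[Rule],
                   pending: FrozenSet[str] = frozenset()) -> bool:
    """Hypothesis-driven: start from the goal and look for rules that conclude it."""
    if goal in facts:
        return True
    if goal in pending:  # guard against circular rule chains
        return False
    pending = pending | {goal}
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules, pending) for p in premises
        ):
            return True
    return False


if __name__ == "__main__":
    findings = {"fever", "stiff_neck", "gram_negative_stain"}
    print(sorted(forward_chain(findings, RULES)))
    print(backward_chain("cover_gram_negative", findings, RULES))
```

Both procedures read the same rule list, which mirrors the separation of rule base from reasoning strategy that the text emphasizes. A real consultation system would also ask the user for missing findings during backward chaining, which this sketch omits.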

Building various expert systems capitalized on the increasing experience with alternative ways of representing knowledge. For example, the IRIS system [Trigoboff, 1976] applied a semantic network model (as was being developed in PROSPECTOR [Hart and Duda, 1977]) to represent glaucoma consultation knowledge with multiple strategies of inference; the CRYSALIS system [Englemore and Terry], which explicated the structure of chemical substances from X-ray diffraction data, used the blackboard model from speech understanding; the VM system developed a real-time variant of a rule-based system for ventilation management [Fagan et al., 1979]; and the MDX system pursued a distributed conceptual model for diagnosis [Chandrasekaran et al., 1979]. The CENTAUR system [Aikins, 1979] showed that both rules and frames could be usefully combined to represent expert reasoning, and AI/RHEUM [Lindberg, 1980] demonstrated how rule-based systems with specialized medical semantics could be adapted to represent already formalized knowledge found in rheumatologic diagnostic criteria tables. Meanwhile, many other systems were developed showing that rule-based methods could be widely and systematically applied [Reggia et al., 1980; Horn et al., 1980; Speedie et al., 1981; Buchanan and Shortliffe, 1984; Weiss and Kulikowski, 1984].

The evaluation of the first-generation systems was pursued for CASNET [Lichter and Anderson, 1977], MYCIN [Yu et al., 1979], the Digitalis Advisor [Swartout, 1981], and INTERNIST-1 [Pople, 1977]. Critical examination of knowledge structures in INTERNIST-1 led to their being rerepresented in INTERNIST-2 [Pople, 1977]. General methodologies for evaluating clinical predictions were also being developed [Shapiro, 1977]. This testing and evaluation led to the first efforts at automating knowledge acquisition and maintenance of the rule bases. For instance, the TEIRESIAS system was designed to improve the performance of MYCIN through heuristic analysis of individual cases leading to changes in rules that were incorrectly invoked or designed [Davis, 1979].

Likewise, Politakis and Weiss [1980] developed methods for testing and improving EXPERT rule bases by providing logical and statistical performance evaluation and rule-modification tools in an interactive environment. In 1980, a first workshop on general expert systems was held to compare the different techniques for knowledge representation and the systems that implemented them. A common problem was selected in advance without the knowledge of the different research groups and presented to them at the workshop for implementation in their system. The results of this experiment are summarized in Hayes-Roth et al. [1983]. While all the existing shells were able to represent a complex advice-giving problem (the diagnosis and tracking of a chemical spill), those with fixed inference engines proved the easiest to use for rapid prototyping. As might have been expected, the more general AI languages designed to capture expertise did permit more flexible, expressive, and detailed descriptions of the problem but at the cost of a much greater investment of time and effort in building both knowledge bases and specialized inference procedures.

The development and assessment of rule-based and frame-based systems demonstrated that while they could effectively capture large quantities of domain-specific expertise, there were still many unanswered questions about how the expertise should be best applied to specific problems while at the same time abstracting out general representations of knowledge. The complexity of human problem-solving processes revealed by cognitive analyses and simulations of clinical reasoning [Elstein et al., 1978] pointed to the difficulties of reconciling these goals.

Exploration of Alternative Representations for Medical AI and the Search for Performance—1983-1987

Two almost opposite trends in medical AI emerged from the generalization of expert systems, which continue to the present. On the one hand, driven by the goal of practical clinical applications, researchers tried to adapt existing representations and obtain high performance from them. This led to various systems with very specific goals, such as the interpretation of laboratory tests, which even reached commercial implementation and dissemination [Weiss et al., 1983]. At the opposite end of the spectrum, dissatisfaction with the oversimplified cognitive style of rule-based expert systems and the inadequate coverage and performance of large knowledge bases led to research into alternative representations of deeper medical knowledge [Chandrasekaran et al., 1979].

Explanation and Early Knowledge Level Work in AI Systems

In an attempt to reuse the MYCIN knowledge base for tutorial purposes in the GUIDON system [Clancey, 1989], it was found that generating explanations from the original rule base was not only difficult but also revealed that all kinds of implementational details for consultative inferencing were mixed in with the more abstract, descriptive medical knowledge that supported the inferences. Besides adding metarules for guiding the discourse involved in tutoring, the experience in building GUIDON led Clancey to reconsider the generality of the MYCIN rule base and its representation. The NEOMYCIN system [Clancey and Letsinger, 1981] completely separated descriptive medical knowledge from details of reasoning task implementation so that it would better match the problem-solving tasks of human experts. This coincided with Newell’s more general observation across AI systems: that they needed to describe problem solving at the knowledge level, free of the encumbrances of specific programming details [Newell, 1982]. In medicine, a similar experience had been reported in a follow-up to the digitalis project by Swartout, who designed the XPLAIN system [Swartout, 1981] as an automatic programming approach for generating a consultation model from the specification of abstract system goals. Explanations and justifications would then be automatically built into the performance system.

The role of explanatory reasoning became a major focus of attention for many medical AI groups during this period. One of the earliest and major motivations for the AI approaches to reasoning was that they were able to explain their logic in terms that were much closer to the arguments of physicians. However, by the end of the first decade, most AIM researchers found that the explanatory capabilities of their systems, while useful for tracing logical connections through their knowledge, left much to be desired in terms of naturalness and flexibility. Traditional statistical decision systems could produce an explanation by accounting for each manifestation’s contribution to the total probability (or weight) of a diagnosis. In CASNET, an explanation could be generated by tracing the pathway of confirmed causal states that led to the final diagnosis. The confidence in the confirmation of each state could, in turn, be explained by the pattern of observed evidence that supported it. In MYCIN and other derived rule-based systems, explanations could be easily generated by tracing the rules involved in producing a decision. In the Digitalis Advisor, an explanation could be generated in terms of the causal underpinnings of the patient-specific model, while in INTERNIST-1, the diagnostic strategy could be traced in terms of the sequence of manifestations activated or covered by a particular set of hypotheses. All these approaches, while initially satisfying to the developers, upon further consideration proved to be inadequate. Tracing every detail of logic by which a system arrived at a diagnostic or therapeutic conclusion could be very useful for debugging the knowledge base during system development, but it rapidly became tedious to the expert reviewer or the practitioner using the system. What was missing was a clear summary of the underlying justification of the reasoning.
A major reason for this was that the first-generation representations, in their emphasis on symbolic and qualitative descriptions of knowledge, as opposed to the earlier numerical representations, failed to separate or abstract out information at the true knowledge level from the finer-grained symbolic details of the implementation.

NEOMYCIN [Clancey and Letsinger, 1981] demonstrated the need to reorganize the knowledge in a rule base in order to be able to produce “intelligently studied reasons” for a decision. The ABEL system showed the critical importance of hierarchical causal descriptions using composition and decomposition operators to explain interactions among subprocesses contributing to an overall pathophysiologic disease process [Patil, 1983]. In a restructuring of the INTERNIST-1 knowledge base, Pople [1982] showed how combinations of very elementary operators (causal, hierarchical) could be used to construct and explain alternative complexes of hypotheses with different interpretations for the same set of confirmed states. MDX stressed the importance of describing problem-solving tasks in diagnosis and using procedural knowledge to capture the knowledge of expert practitioners [Chandrasekaran et al., 1979]. This research, applied to several liver diagnostic problems and a red cell identification problem [Smith et al., 1985], among others, led Chandrasekaran [1986] to develop a theory of generic problem-solving tasks in order to advance our understanding of diagnostic and other expert reasoning processes.

A completely different form of reasoning was implemented in the ATTENDING system [Miller, 1983]. Rather than modeling consultative reasoning directly, Miller analyzed and critiqued the plans that physicians had developed for anesthesia administration, using an augmented transition network (ATN) of states. This novel departure helped define a new type of critiquing system that was very useful for instructional purposes and therefore more likely to be used by physicians who either felt threatened by automated consultative systems or else found them lacking and not essential to their practice.

Performance of Expert Medical Systems

In trying to deal with practical clinical problems, many AIM researchers found that the first general expert systems shells did not satisfy their requirements. The shells frequently had many options that would never be used in a particular application while being difficult to adapt to problem-specific reasoning methods and domain-specific knowledge structures. Furthermore, for an expert system to be really useful, they found that design of easy-to-use human interfaces, while not scientifically rewarding at first glance, was actually quite essential. In this regard, the next major medical project of the Heuristic Programming Project at Stanford, the ONCOCIN system [Shortliffe et al., 1981], demonstrated that well-engineered interfaces mimicking on a screen the input of charts, images, and other data for cancer treatment protocols could directly improve data acquisition and were essential in guiding the application of the protocols. Knowledge acquisition for ONCOCIN was provided by a very flexible graphics-oriented system, called OPAL [Musen et al., 1986]. An important representational issue also was settled in the ONCOCIN experiments: the need for event-driven reasoning as well as goal-driven reasoning in rule-based systems, as had been earlier advocated in the EXPERT scheme.

Exploiting other, natural knowledge representations became a theme in an application of the EXPERT system for building a rheumatology knowledge base [Lindberg et al., 1983]. Here it was found that domain-specific decision rules had already been defined by the American Rheumatological Association in the form of diagnostic criteria tables. These, however, needed more detailed elaboration in terms of observational criteria for findings, as well as customization of the diagnostic criteria within a working expert system—AI/RHEUM [Kingsland et al., 1983].

The goal of producing high-performance systems led researchers during this period to experiment with technology transfer, disseminating the knowledge of experts within the medical community. The first expert system on a chip was pioneered at Rutgers with the SPE/EXPERT system for serum protein electrophoresis analysis [Weiss et al., 1983]. In this system, a knowledge base was developed by a leading specialist in the field, using the high-level EXPERT language, and then automatically translated into algorithmic form and compiled into the assembly language of the ROM that existed to process signals from the scanning densitometer of the Cliniscan instrument (TM-Helena Laboratories, Beaumont, Texas). Similar technology was used to develop an advice-giving system for primary eye care on one of the earliest hand-held computers [Kastner et al., 1984].

In a very large undertaking, the INTERNIST-1 knowledge base was recast into a form that would make it more easily accessible by flexible querying in the QMR system [Miller, 1984], rather than being restricted to the consultative mode it had been before. The intelligent retrieval facilities of QMR have made it one of the most widely disseminated results of expert system research [Miller, 1994]. In a similar trend, other earlier non-AI systems were reengineered to have more modular architectures, as with the HELP system, which provided advice as part of the medical information system at the Latter Day Saints Hospital. Its successor, the ILLIAD system, has subsequently become an integral part of quality assurance functions in the hospital network [Lau and Warner, 1992]. Likewise, the DXplain general decision support system was made widely available to physicians through the AMA/NET in the 1980s [Barnett et al., 1987].

With the increasing dissemination of AIM systems came parallel efforts to test and evaluate their performance while exploring new ways of dealing with the difficult issues of reasoning in uncertain environments.

An empirical approach to performance evaluation was taken with the AI/RHEUM system testing and resulted in the first system for refining a rule base using the accumulated experience of expertly solved cases [Politakis and Weiss, 1984]. This system, called SEEK, employed performance heuristics to suggest potential improvements in the rules (generalizations and specializations), which could be tested by incorporation into the model and evaluation over the entire database of cases. In SEEK 2, this work was extended to generate automatically and test the potential improvements, selecting the most successful improvements for inclusion in an updated knowledge base [Ginsberg, 1986]. Related to these approaches is the basic problem of uncovering underlying causal relations through the analysis of time-oriented data, such as was carried out in the RX system [Blum, 1982]. During this period, causality became a major concern of researchers seeking strong theoretical underpinnings that would explain reasoning in complex diagnostic consultation models [Patil, 1986].

Formal Theories of Reasoning Under Uncertainty in Medicine

Many different formal methods of inference were explored and applied in knowledge-based medical decision making. These included the application of multihypothesis Bayesian methods [Ben-Bassat et al.], of fuzzy logic as in the CADIAG family of systems [Adlassnig and Kolarz, 1982] and the SPHINX system [Fieschi et al., 1983], or of the Dempster-Shafer theory of evidence to structured medical problems [Gordon and Shortliffe, 1985]. Reggia et al. [1983] devised set covering strategies to parsimoniously explain diagnostic hypotheses within a probabilistic framework [Peng and Reggia, 1987].

Two major interrelated problems afflicting the handling of uncertainty in clinical decision making were how to combine evidence from multiple related sources in a non-mutually exclusive multihypothesis inference problem and how to represent the flow of uncertainty among intermediate causal and hierarchically related hypotheses. These problems had existed from the very beginning of medical AI, and their formal treatment was avoided through the use of empirical, heuristic methods in the first generation of systems. Now researchers embarked on a more principled search for solutions. A probabilistic approach to propagating the effect of local changes in probabilistic networks was developed by Lauritzen and Spiegelhalter [1987]. The Dempster-Shafer theory provided an alternative formalism for incorporating partial belief information into a complex network of hypotheses. However, in practice, it often proved computationally expensive or might produce very broad and uninformative confidence bounds on hypotheses at the end of long reasoning chains. In an attempt to provide a link between the semantics of hypotheses related at different levels of causation and abstraction, Pearl [1987] developed methods for propagating uncertainty within a structured Bayesian framework of statistical decision making. The more systematic application of these techniques and their implications for specific medical reasoning problems continues [Pearl, 1995].
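As an illustration of how partial belief from two sources can be pooled in the evidence-theoretic formalism, the following minimal sketch implements Dempster's rule of combination together with the belief and plausibility bounds it induces. It is not taken from any of the systems cited here. The two-hypothesis frame and the mass values are hypothetical, and mass functions are represented as dictionaries from frozensets of hypotheses to weights that sum to 1.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def dempster_combine(m1: Mass, m2: Mass) -> Mass:
    """Combine two mass functions; mass on empty intersections is renormalized away."""
    combined: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; combination is undefined")
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


def belief(m: Mass, hypothesis: FrozenSet[str]) -> float:
    """Lower bound: total mass committed to subsets of the hypothesis."""
    return sum(w for h, w in m.items() if h <= hypothesis)


def plausibility(m: Mass, hypothesis: FrozenSet[str]) -> float:
    """Upper bound: total mass not contradicting the hypothesis."""
    return sum(w for h, w in m.items() if h & hypothesis)


if __name__ == "__main__":
    A, B = frozenset({"disease_a"}), frozenset({"disease_b"})
    either = A | B  # mass here means "undecided between the two"
    source1 = {A: 0.6, either: 0.4}
    source2 = {B: 0.5, either: 0.5}
    pooled = dempster_combine(source1, source2)
    print(belief(pooled, A), plausibility(pooled, A))
```

The gap between belief and plausibility is the confidence interval referred to in the text. Mass left on broad sets such as `either` keeps that interval wide, and the pairwise product over all focal sets hints at why the computation grows costly in large hypothesis spaces.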

The revival of interest in connectionist, neural network methods for learning and decision making was strongly stimulated around this time by the introduction of backpropagation methods [Rumelhart and McClelland, 1986]. These used differentiable nonlinear threshold functions that enabled the learning of much more complex decision rules from data than had been possible with the simpler perceptron methods of the 1960s. As researchers and practitioners came to realize that the expert systems and knowledge-engineering approaches required large amounts of investment by the most costly and indispensable consultants in the specialty being modeled, efforts again began to emphasize methods that learned directly from data through artificial neural networks (ANNs), as well as simpler but more explainable structures, such as decision trees [Breiman et al., 1984].

The Past Decade—Structure, Formalism, and Empiricism

Deep Medical Knowledge for Representation and Problem Solving

The trends described above have continued since 1987 and throughout the present decade as researchers continued to work on representations for the deeper medical knowledge that could help explain and justify decision making. One approach has continued to emphasize the centrality of causal reasoning and simulation in providing knowledge-based explanations for the physiologic underpinnings of medical reasoning [Kuipers, 1985; Long, 1987, 1991; Hunter and Kirby, 1991; Widman, 1992].

More generally, Newell’s knowledge-level approach [Newell, 1982] became the inspiration for the second generation of expert systems, which emphasized the multilevel nature of modeling problem-solving and related interacting knowledge structures [Chandrasekaran and Johnson, 1993].

Applications of Newell’s SOAR architecture [Smith and Johnson, 1993] enabled a very general type of problem-solving mechanism (chunking) to roam over flexible spaces of goals, tasks, and specific knowledge structures, thereby providing a common framework for reasoning about decisions in individual cases or with groups of cases, as for learning rules or other changes to the knowledge base. In this, as in the KADS approach [Wielinga et al., 1992], expertise is viewed as a dynamic process that cannot be captured in strictly static knowledge structures but rather evolves as part of an ongoing interactive modeling of systems, agents, problems, and environments [Clancey, 1989]. A discussion of how this problem-solving cognitive approach affects medical decision making can be found in Kleinmuntz [1991]. Meanwhile, many of the lessons learned from earlier prototypes were applied to more sophisticated knowledge-based explanation and system-building schemes [Bylander et al., 1993; Koton, 1993; Swartout and Moore, 1993; Lanzola et al., 1995]. Relationships between abductive reasoning and temporal reasoning were investigated [Console and Torasso, 1991], and an epistemologic framework for structuring medical problem solving was stated [Ramoni et al., 1992]. It is interesting to note that a related cognitive approach to eliciting and structuring medical knowledge grew up independently in the former Soviet Union through the work of Gelfand and collaborators [Gelfand et al., 1987; Gelfand, 1989].

Formal Decision-Making Methods and Empirical Systems

Formal decision-making approaches to medical reasoning took off in the 1980s with the founding of the journal Medical Decision Making, which involved physicians as well as decision scientists in the process of understanding how formal statistical and logical methods can apply to the modeling and interpretation of medical problems. This renewed interest in the modeling of reasoning under risk and uncertainty was motivated by the twin needs of evaluating decision-support systems and of modeling more general epidemiologic effects of medical interventions, with and without computer-based systems being involved. While not directly related to knowledge-based AI approaches, this trend nevertheless interacted with developments in operations research (OR) and AI through the confluence of the influence diagram representation of abstract states for describing processes [Shachter, 1986] and belief network representations for reasoning with causal structured hypothesis networks [Pearl, 1987]. Together, they helped make Bayesian belief networks a practical tool for modeling many of the dynamic aspects of clinical cognition [Cowell et al., 1991] and provided a combined statistical-structural representation for decision-making preferences in computer-based systems [Lehman and Shortliffe, 1990; Farr and Shachter, 1992; Heckerman and Shortliffe, 1992; Jimison et al., 1992; Neapolitan, 1993]. The related use of the decision theoretic approach to integrate knowledge-based and judgmental elements has also been an important development in assisting the evaluation of systems [Cooper, 1993]. Knowledge-based tools for specifying large expert system models [Olesen and Andreassen, 1993; Lanzola et al., 1995] and for building decision models [Sonneberg et al., 1994] are also becoming more prevalent.
In addition, tools from meta-analysis have begun to be applied to assess conflicting results from related though different clinical studies [Littenberg and Moses, 1993] and to merge diagnostic accuracy results [Midgette et al., 1993]. The temporal nature of medical processes and data has also led to some initial models for decision making that make temporal dependencies explicit in their representation [Kahn, 1991; VanBeek, 1991; Hazen, 1992; Cousins et al., 1993; Shuval and Musen, 1993]. Practical systems in the ICU or other monitoring situations typically have to use a temporal representation in the context of online expert systems [Schecke et al., 1991; Mora et al., 1993; Rutledge et al., 1993; Lehmann et al., 1994; Becker et al., 1997]. Generalized representations of temporal knowledge that enable reasoning with abstractions of event sequences linked to ontologies of problem solving have been developed for a wide variety of medical problems where the time course of an illness is critical [Haimowitz and Kohane, 1995; Long, 1996; Keravnou, 1996; Shahar and Musen, 1996].

With more sophisticated methods for learning in artificial neural networks (ANNs), the last decade has seen a concomitant increase in their application to the modeling and learning of medical decisions [Barreto and DeAzevedo, 1993; Cherkassy and Lari-Najafi, 1992; Cho and Reggia, 1993; Forsstrom et al., 1991a; Hudson et al., 1991; Sittig and Orr, 1992; Dorfnerr and Porenta, 1994; Stevens et al., 1996]. Logic-based methods have also seen some popularity in structuring and building knowledge bases in medicine, particularly in Europe and Japan [Fox et al., 1990; Krause et al., 1993; Lucas, 1993]. Fuzzy logic methods have also shown their usefulness in a number of monitoring and image interpretation tasks [Becker et al., 1997; Park et al., 1998].

The evaluation of expert systems and decision support systems remains an important ongoing research and practical problem [Miller, 1986; Potoff et al., 1988; Willems et al., 1991; Bankowitz et al., 1992; Nohr, 1994], to which a variety of machine learning techniques can be applied for performance assessment [Weiss and Kulikowski, 1991], knowledge base refinement [Widmer et al., 1993], and the retrospective analysis of results [Forsstrom et al., 1991b].

Prospects for AI in Medicine—Problems and Challenges

Usefulness of AI Decision Support Systems

As we approach the end of the 1990s, methodological developments in both the representation of knowledge and inferencing methods in medicine, however impressive in terms of their improvements over those of 20 or 30 years ago, still have not resulted in systems that are routinely used in the clinic. Reasons given for this usually include limitations in scope or ability to customize advice, lack of trust in automated advising systems, inadequate timeliness in critical situations, perception of the computer as a competitor, and most importantly, that decision support systems have yet to become indispensable adjuncts in medical practice situations. Problems and opportunities for medical AI in the 1990s have been identified and discussed with varying degrees of optimism or pessimism, depending on the author [Shortliffe, 1993; Uckun, 1993; Coiera, 1996; Kulikowski, 1996]. It has been suggested that the emphasis on decision support is misguided and should be redirected towards medical education, which would empower health care workers [Lillehaug and Lajoie, 1997]. This would tap into the developing theories of clinical cognition and the use of clinical simulations to train and enhance learning at various levels of expertise [Evans and Patel, 1989; Patel et al., 1995]. Some support for such a position comes from a comparative study of four of the most widely disseminated diagnostic decision support systems [Berner et al., 1994], which resulted in an editorial comment to the effect that their overall rating left much to be desired: a grade of C [Kassirer, 1994]. However, the systems reviewed were general diagnostic aids, which most physicians find less than indispensable given their own training and expertise. Another comparative study [Johnston et al., 1994] reported positive contributions to health care in three out of ten decision support systems reviewed.
Most of the shortcomings came in systems that did not have to perform complex modeling or numerical computations, where computer-based assistance is often indispensable. Fields such as specialized laboratory test interpretation, treatment dosage planning and review, instrumentation monitoring and control, and multimodality imaging all suggest more essential domains for the application of AI methods [Adlassnig and Horak, 1995; Barahona, 1994; DeMelo et al., 1992; Kaye et al., 1995; Kulikowski et al., 1995; Menzies and Compton, 1997; Mora et al., 1993; Siregar and Sinteff, 1996; Taylor et al., 1997].

Opportunities: Knowledge-Bases, Systems Integration and the Web

The great computing success story of the 1990s has been the World Wide Web, and the proliferation of web-based information services. The ease of designing good graphical user interfaces (GUIs) has resulted in a concomitant ease of interaction with our networked machines. E-mail has become ubiquitous, and various standard data interchange formats across platforms have been developed in medicine as in other fields. This presents AI in medical decision-making with a number of challenges and opportunities. The first is to characterize and build the knowledge bases that can help improve medical decision-making; the second is to integrate them into useful, working, indispensable clinical systems; the third is to validate, standardize, and share medical knowledge as widely and economically as possible. The development of standard medical vocabularies and nomenclatures, coding schemes, and the Unified Medical Language System (UMLS) [Humphreys et al., 1995] are all essential to the standardization of medical knowledge representation. The availability of large digital image datasets has resulted in the creation of systems and knowledge bases for handling spatial information in anatomy [Brinkley, 1995; Hohne et al., 1995; Rosse et al., 1998], and has been accelerated by the widely disseminated multimodality image data from the Visible Human Project [Ackerman, 1995].

The re-usability of software is greatly facilitated by the development of ontologies for different types of knowledge and problem solving, a classical type of AI research which has received considerable new impetus from the needs of rapid knowledge transfer and scalability of web-based systems. In medicine, the PROTEGE system has become a model for such software re-use experiments [Tu et al., 1995], while other systems based on ontologies have also been developed [Smith et al., 1995; Muller et al., 1996].

Software integration for multiple uses is increasingly spurred by initiatives such as IAIMS [Stead, 1997], and inclusion of useful decision support modules may evolve naturally as productive computing environments become more commonplace. One method that is gaining acceptance for connecting clinical databases to the knowledge bases used in decision support is the Arden syntax [Hripcsak et al., 1990]. Automatic extraction of clinical data from narrative reports is also progressing due to a combination of statistical and linguistic methods [Chute and Yang, 1995; Hripcsak et al., 1996]. Meanwhile, AI systems are being designed to provide patients with intelligent access to medical information [Buchanan et al., 1995].
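As a rough illustration of the idea behind connecting a clinical database to decision logic, the sketch below mimics in Python the way an Arden-style Medical Logic Module keeps its data, evoke, logic, and action parts separate. It is not Arden syntax itself. The record layout, the field and event names, and the potassium threshold are all hypothetical, chosen only to show how a rule can be written independently of the database query that feeds it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

# Hypothetical stand-in for a clinical database: patient id -> latest results.
CLINICAL_DB: Dict[str, Dict[str, Any]] = {
    "p001": {"serum_potassium_mmol_l": 2.9},
    "p002": {"serum_potassium_mmol_l": 4.1},
}


@dataclass
class LogicModule:
    """Loose Python analogue of one Medical Logic Module (illustrative only)."""
    evoke: str                                      # event that triggers the module
    data: Callable[[str], Dict[str, Any]]           # maps local names to database reads
    logic: Callable[[Dict[str, Any]], bool]         # decision rule over the local names
    action: Callable[[str, Dict[str, Any]], str]    # what to do when the rule fires

    def run(self, event: str, patient_id: str) -> Optional[str]:
        if event != self.evoke:
            return None                             # module not evoked by this event
        values = self.data(patient_id)
        if not self.logic(values):
            return None                             # rule concluded false, no action
        return self.action(patient_id, values)


def read_potassium(patient_id: str) -> Dict[str, Any]:
    # Only this function knows the (assumed) database field name.
    return {"potassium": CLINICAL_DB[patient_id]["serum_potassium_mmol_l"]}


low_potassium = LogicModule(
    evoke="potassium_result_stored",
    data=read_potassium,
    logic=lambda v: v["potassium"] < 3.0,  # illustrative threshold, not clinical guidance
    action=lambda pid, v: f"ALERT {pid}: low potassium ({v['potassium']} mmol/L)",
)
```

Calling `low_potassium.run("potassium_result_stored", "p001")` returns an alert string, while the same call for `"p002"` returns `None`. Moving the rule to a site with a different database would, in this sketch, require changing only `read_potassium`.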

The present situation, then, can be characterized as one of consolidation in integrating a variety of methods and techniques for different types of medical decision tasks while at the same time involving considerable reassessment and change in terms of expectations about the imminent application of decision-making systems within health care environments. On the one hand, the forthcoming increase in networking of various knowledge sources, digital libraries, and multimedia information opens up a whole new set of opportunities for filtering knowledge to the overwhelmed practitioner, whether in medicine, law, engineering, or other professions. On the other hand, the hard issues of understanding the role of human judgment and responsibility for clinical decisions and actions when using increasingly complex computerized instrumentation and systems present an ongoing challenge to researchers in medicine, biomedical engineering and computer science, medical informatics, and all other fields essential to the solution of these problems.


Ackerman MJ. 1991. The Visible Human Project. J. Biocommun. 18(2):14.

Adlassnig KP, Kolarz G. 1982. CADIAG-2: Computer-Assisted Medical Diagnosis Using Fuzzy Subsets. In: Approximate Reasoning in Decision Analysis, pp. 219-247.

Adlassnig KP, Horak W. 1995. Development and retrospective evaluation of HEPAXPERT-1. A routinely used expert system for interpretive analysis of hepatitis A and B serologic findings. Artif. Intell. Med. 7(1):1.

Aikins J. 1979. Prototype and production rules: An approach to knowledge representation for hypothesis information. In: Proc. 6th Int. Joint Conf. Artificial Intell., pp. 1-3. Tokyo, Japan.

Bankowitz RA, Lave JR, McNeil MA. 1992. A method for assessing the impact of a computer-based decision support system on health care outcomes. Meth. Inform. Med. 31:3.

Barahona P. 1994. A causal and temporal reasoning model and its use in drug therapy applications. Artif. Intell. Med. 6(1):1.

Barnett GO, Cimino JJ, Hupp JA, Hoffer EP. 1987. DXplain: An evolving diagnostic decision-support system. JAMA 258:67.

Barreto JM, DeAzevedo FM. 1993. Connectionist expert systems as medical decision aid. Artif. Intell. Med. 5(6):515.

Becker K, Thull B, Kasmacher-Leidinger H, et al., 1997. Design and validation of an intelligent patient monitoring and alarm system based on a fuzzy logic process model. Artif. Intell. Med. 11(1):33.

Ben-Bassat M, Carlson RW, Pun VK, et al., 1980. Pattern-based interactive diagnosis of multiple disorders: The MEDAS system. IEEE Trans. Pat. Anal. Mach. Intel. (PAMI) 2(2):148.

Berner ES, Webster GD, Shugerman AA, et al., 1994. Performance of four computer-based diagnostic systems. N. Engl. J. Med. 330(25):1792.

Blum R. 1982. Discovery, confirmation and incorporation of causal relationships from a large time-oriented data base: The RX project. Comp. Biomed. Res. 15:164.

Breiman L, Friedman J, Olshen R, Stone C. 1984. Classification and Regression Trees. Monterey, CA, Wadsworth.

Brinkley JF, Bradley SW, Sundsten JW, Rosse C. 1997. The Digital Anatomist information system and its use in the generation and delivery of web-based anatomy atlases. Comp. Biomed. Res. 30:472.

Buchanan BG, Sutherland G, Feigenbaum EA. 1970. Rediscovering some problems of artificial intelligence in the context of organic chemistry. In: B Meltzer, D Michie (Eds.), Machine Intelligence, pp. 209-254. Edinburgh, Edinburgh University Press.

Buchanan BG, Shortliffe EH. 1984. Rule-Based Expert Systems: The MYCIN Experiments in the Stanford Heuristic Programming Project. Reading, MA, Addison-Wesley.

Buchanan BG, Moore JD, Forsythe DE, et al., 1995. An intelligent interactive system for delivering individualized information to patients. Artif. Intell. Med. 7(2):117.

Bylander T, Weintraub M, Simon SR. 1993. QUAWDS: Diagnosis using different models for different subtasks. In: J-M David et al., (Eds.), Second Generation Expert Systems, pp. 110-130. Berlin, Springer-Verlag.

Chandrasekaran B, Gomez F, Mittal S, Smith J. 1979. An approach to medical diagnosis based on conceptual schemes. In: Proc. 6th Int. Joint Conf. Artif. Intell., pp. 134-142. Tokyo, Japan.

Chandrasekaran B. 1986. Generic tasks in knowledge-based reasoning. High-level building blocks for expert system design. IEEE Expert Intelligent Systems and Their Applications 1(3):23.

Chandrasekaran B, Johnson TR. 1993. Generic tasks and task structures: History, critique and new directions. In: JM David, et al., (Eds.), Second Generation Expert Systems, pp. 232-272. Berlin, Springer-Verlag.

Cherkassky V, Lari-Najafi H. 1992. Data representation for diagnostic neural networks. IEEE Expert 7(5):43.

Cho S, Reggia JA. 1993. Multiple disorder diagnosis with adaptive competitive neural networks. Artif. Intell. Med. 5(6):469.

Chute CG, and Yang Y. 1995. An overview of statistical methods for the classification and retrieval of patient events. Methods Med. Inf. 34(1): 104.

Ciesielski V (Ed.). 1978. Proceedings of the Fourth Annual AIM Workshop, Rutgers University Technical Report.

Clancey WJ. 1979. Tutoring rules for guiding a case method dialogue. Int. J. Man-Machine Stud. 11:25.

Clancey WJ, Letsinger R. 1981. NEOMYCIN: Reconfiguring a rule-based expert system for application to teaching. In: Proc. 7th Int. Joint Conf. Artif. Intell., pp. 829-836.

Clancey WJ. 1989. The knowledge level reinterpreted: Modeling how systems interact. Mach. Learn. 4:285.

Clarke K, O’Moore R, Smeets R, et al., 1994. A methodology for evaluation of knowledge-based systems in medicine. Artif. Intell. Med. 6(2):107.

Console L, Torasso P. 1991. On the co-operation between abductive and temporal reasoning in medical diagnosis. Artif. Intell. Med. 3(6):291.

Cooper G. 1993. Probabilistic and decision-theoretic systems in medicine. Artif. Intell. Med. 5(4):289.

Cousins SB, Chen W, Frisse ME. 1993. A tutorial introduction to stochastic simulation algorithms for belief networks. Artif. Intell. Med. 5(4):315.

Cowell RG, Dawid AP, Hutchinson T, Spiegelhalter DJ. 1991. A Bayesian expert system for the analysis of an adverse drug reaction. Artif. Intell. Med. 3(5):257.

Das AK and Musen MA. 1994. A temporal query system for protocol-directed decision support. Meth. Inf. Med. 33:358.

Davis R. 1979. Interactive transfer of expertise: Acquisition of new inference rules. Artif. Intell. 12:121.

DeMelo AS, Grarner J, Bronzino JD. 1992. SPINEX: An expert system to recommend the safe dosage of spinal anesthesia. In: Proc. 14th Ann. Int. Conf. IEEE Eng. in Med. and Biol. Soc., pp. 902-903. Paris, France.

Duda RO, Hart PE, Nilsson NJ. 1976. Subjective Bayesian methods for rule-based inference systems. In: Proc. Natl. Comput. Conf., New York.

Elstein AS, Shulman LS, Sprafka SA. 1978. Medical Problem Solving: An Analysis of Clinical Reasoning. Cambridge, MA, Harvard University Press.

Engelmore RS, Terry A. 1979. Structure and function of the Crysalis system. In: Proc. Sixth Int. Joint Conf. Artif. Intell., pp. 250-256.

Evans DA and Patel VL. (Eds.). 1989. Cognitive Science in Medicine. Cambridge, MA, MIT Press.

Fagan LM, Kunz JC, Feigenbaum EA, Osborn JJ. 1979. Representation of dynamic clinical knowledge: Measurement interpretation in the intensive care unit. In: Proc. 6th Int. Joint Conf. Artif. Intell., pp. 250-256.

Farr BR, Shachter RD. 1992. Representation of preferences in decision-support systems. Comp. Biomed. Res. 25(4):324.

Feigenbaum EA. 1978. The art of artificial intelligence: Themes and case studies of knowledge engineering. In: Proc. Natl. Comput. Conf., p. 221. New York.

Fieschi M, Joubert M, Fieschi D, et al., 1983. A production rule expert system for medical consultations. Medinfo 83:503.

Forsstrom J, Eklund P, Virtanen H, et al., 1991a. DIAGAID: A connectionist approach to determine the diagnostic value of clinical data. Artif. Intell. Med. 3(4):193.

Forsstrom J, Nuutila P, Irjala K. 1991b. Using the ID3 algorithm to find discrepant diagnoses from laboratory databases of thyroid patients. Med. Decis. Mak. 11(3):171.

Fox J, Glowinski A, Gordon C, et al., 1990. Logic engineering for knowledge engineering: Design and implementation of the Oxford system of medicine. Artif. Intell. Med. 2(6):323.

Freiher G. 1979. The seeds of artificial intelligence: SUMEX-AIM. In: Div Res. Res., pp. 80-2071. NIH.

Gelfand IM, Rosenfeld BI, Shifrin A. 1987. Data Structuring in Medical Problems. Moscow, USSR Academy of Sciences.

Gelfand IM. 1989. Two archetypes in the psychology of man. Kyoto Prize Lecture.

Ginsberg A, Weiss SM, Politakis P. 1985. Seek-2: A generalized approach to automatic knowledge base refinement. In: Proc. 9th Int. Joint Conf. Artif. Intell.

Ginsberg A. 1986. A metalinguistic approach to the construction of knowledge base refinement systems. Proc. AAAI 86:436.

Gordon J, Shortliffe EH. 1985. The Dempster-Shafer theory of evidence in rule-based expert systems. In: BG Buchanan, EH Shortliffe (Eds.), Rule-Based Expert Systems, pp. 272-292. Reading, MA, Addison-Wesley.

Gorry GA. 1970. Modeling the diagnostic process. J. Med. Ed. 45:293.

Gorry GA, Silverman H, Pauker SG. 1978. Capturing clinical expertise: A computer program that considers clinical responses to digitalis. Am. J. Med. 64:452.

Haimowitz IJ, Le PP, and Kohane IS. 1995. Clinical monitoring using regression-based trend templates. Artif. Intell. Med. 7(6):473.

Hart PE, Duda RO. 1977. PROSPECTOR: A computer-based consultation system for mineral exploration. SRI Tech. Rep. 155.

Hayes-Roth F, Waterman D, Lenat D. 1983. Building Expert Systems. Reading, MA, Addison Wesley.

Hazen GB. 1992. Stochastic trees: A new technique for temporal medical decision modeling. Med. Decision Making 12(3):163.

Heckerman DE, Shortliffe EH. 1992. From certainty factors to belief networks. Artif. Intell. Med. 4(1):35.

Heckerman DE, Horvitz EJ, and Nathwani BN. 1992. Toward normative expert systems. Part I—The Pathfinder system. Methods Inf. Med. 31 (2): 90.

Hohne KH, Pflesser B, Pommert A, et al., 1995. A new representation of knowledge concerning human anatomy and function. Nat. Med. 1:506.

Horn W, Buchstaller W, Trappl R. 1980. Knowledge structure definition for an expert system in primary medical care. In: Proc. 7th Intl. Joint Conf. Art. Intell., pp. 850-852.

Hripcsak G, Johnson SB, Clayton PD. 1993. Desperately seeking data: Knowledge base-database links. In: Proc. 17th Ann. Symp. Comp. App. Med. Care, 639.

Hripcsak G, Friedman C, Alderson PO, et al., 1995. Unlocking clinical data from narrative reports: A study of natural language processing. Ann. Intern. Med. 122:681.

Hudson DL, Cohen ME, Anderson ME. 1991. Use of neural network techniques in a medical expert system. Int. J. Intell. Syst. 6:213.

Humphreys B, Lindberg DAB, Schoolman HM, and Barnett O. 1998. The Unified Medical Language System. An informatics research collaboration. J. Amer. Med. Inform. Assoc. 5(1): 1.

Hunter J, Kirby I. 1991. Using quantitative and qualitative constraints in models of cardiac electrophysiology. Artif. Intell. Med. 3:41.

Johnston ME, Langton KB, Haynes RB, Mathieu A. 1994. Effects of computer-based clinical decision-support systems on clinician performance and patient outcome. A critical appraisal of research. Ann. Intern. Med. 120:135.

Jimison HB, Fagan LM, Shachter RD, Shortliffe EH. 1992. Patient-specific explanation in models of chronic disease. Artif. Intell. Med. 4(3): 191.

Kahn MG. 1991. Modeling time in medical decision-support programs. Med. Decision Making 11(4):249.

Kahn MG, Fagan LM, Sheiner LB. 1989. Model-based interpretation of time-varying medical data. In: Proc. 13th Ann. Symp. Comp. Appl. Med. Care, pp. 28-32. Washington, IEEE Computer Society Press.

Kassirer JP. 1994. A report card on computer-assisted diagnosis - The grade: C. N. Engl. J. Med. 330(25):1824.

Kassirer JP, Gorry GA. 1978. Clinical problem solving: A behavioral analysis. Ann. Intern. Med. 89:245.

Kastner JK, Dawson CR, Weiss SM, et al., 1984. An expert consultation system for frontline health workers in primary eye care. J. Med. Syst. 8:389.

Kaye J, Primiano FP, and Metaxas D. 1995. Anatomical and physiological simulation for respiratory mechanics. J. Image Guid. Surg. 1(3):164.

Keravnou ET. 1996. Temporal diagnostic reasoning based on time-objects. Artif. Intell. Med. 8(3):235.

Kim JH, Pearl J. 1987. CONVINCE: A conversational inference consolidation engine. IEEE Trans. Syst. Man. Cybernet. 17:120.

Kingsland L, Sharp G, Capps R, et al., 1983. Testing of a criteria-based consultant system in rheumatology. Medinfo. p. 514.

Kleinmuntz B, McLean RS. 1968. Diagnostic interviewing by digital computer. Behav. Sci. 13:75.

Kleinmuntz B. 1991. Computers as clinicians: An update. Comput. Biol. Med. 22(4):227.

Koton PA. 1993. Combining causal models and case-based reasoning. In: J-M David et al., (Eds.), Second Generation Expert Systems, pp. 69-78. Berlin, Springer-Verlag.

Krause P, Fox J, O’Neil M, Glowinski A. 1993. Can we formally specify a medical decision support system? IEEE Expert 8(3):56.

Kuipers BJ. 1985. The limits of qualitative simulation. In: M Kaufmann (Ed.), Proc. 9th Int. Joint Conf. Artif. Intell., pp. 128-136. Los Altos, Calif.

Kuipers BJ. 1987. Qualitative simulation as causal explanation. IEEE Trans. Syst. Man. Cybernet. 17:432.

Kulikowski CA, Weiss S. 1972. Strategies of database utilization in sequential pattern recognition. Proc. IEEE Conf. Decision Control, 103.

Kulikowski CA. 1980. Artificial intelligence methods and systems for medical consultation. IEEE Trans. Pat. Anal. Mach. Intel. (PAMI) 2(5):464.

Kulikowski CA, Ostroff J. 1980. Constructing an expert knowledge base for thyroid disease using generalized AI techniques. In: Proc. 4th Ann. Symp. Comp. Health Care, pp. 175-180.

Kulikowski CA, Gong L, and Mezrich RS. 1995. Knowledge-based medical image analysis for integrating context definition with the radiological report. Methods Med. Inf. 34(1): 96.

Kulikowski CA. 1996. AIM. Quo Vadis? J. Amer. Med. Inform. Assoc. 3(6):432.

Lanzola G, Quaglini S, Stefanelli M. 1995. Knowledge-acquisition tools for medical knowledge-based systems. Methods Inf. Med. 34(1):25.

Lau LM, Warner HR. 1992. Performance of a diagnostic system (Iliad) as a tool for quality assurance. Comp. Biomed. Res. 25(4):314.

Lauritzen SL, and Spiegelhalter DJ. 1987. Local computations with probabilities and their applications to expert systems. J. R. Stat. Soc. (Series B) 50:157.

Lehmann ED, Deutsch T, Carson ER, Sonksen PH. 1994. Combining rule-based reasoning and mathematical modelling in diabetes care. Artif. Intell. Med. 6(2):137.

Lehmann HP, Shortliffe EH. 1990. Thomas: Building Bayesian statistical expert systems to aid in clinical decision making. In: Proc. 14th Symp. Applic. Med. Care, pp. 58-64. New York, IEEE Press.

Lichter P, Anderson D. 1977. Discussions on Glaucoma. New York, Grune and Stratton.

Lillehaug SI, and Lajoie SP. 1998. AI in medical education: another grand challenge for medical informatics. Artif. Intell. Med. 12:197.

Lindberg DA. 1980. Computer based rheumatology consultant. Medinfo. 80:1311.

Lindberg DA, Sharp GC, Kay DR, et al., 1983. The expert consultant as teacher. Moebius 3:30.

Littenberg B, Moses LE. 1993. Estimating diagnostic accuracy from multiple conflicting reports. Med. Decision Making 13(4):313.

Long WJ. 1987. The development and use of a causal model for reasoning about heart failure. In: Proc. 11th Symp. Comp. Applic. Med. Care, pp. 30-36.

Long WJ. 1991. Flexible reasoning about patient management using multiple methods. Artif. Intell. Med. 3:3.

Long WJ, Fraser H, Naimi S. 1997. Reasoning requirements for diagnosis of heart disease. Artif. Intell. Med. 10(1):5.

Lucas PJF. 1993. The representation of medical reasoning models in resolution-based theorem provers. Artif. Intell. Med. 5(5):395.

Menzies T, and Compton P. 1997. Applications of abduction. Hypothesis testing of neuroendocrinological qualitative compartmental models. Artif. Intell. Med. 10:145.

Mora FA, Passariello G, Carrault G, LePichon JP. 1993. Intelligent patient monitoring and management systems: A review. IEEE Eng. Med. Biol. 12:23.

Midgette AS, Stukel TA, Littenberg B. 1993. A meta-analytic method for summarizing diagnostic test performances. Med. Decis. Mak. 13(3):253.

Miller PL. 1983. Attending: Critiquing a physician’s management plan. IEEE Trans. Pat. Anal. Mach. Intel. (PAMI) 5:449.

Miller PL. 1986. The evaluation of artificial intelligence systems in medicine. Comput. Meth. Prog. Biomed. 22:5.

Miller PL. 1998. Tools for immunization guideline knowledge maintenance I. Automated generation of the logic “kernel” for immunization forecasting. Comp. Biomed. Res. 31(3): 172.

Miller RA, Pople HE, Myers JD. 1982. An experimental computer based diagnostic consultant for general internal medicine. N. Engl. J. Med. 307:468.

Miller RA. 1984. Internist-1/Caduceus: Problems facing expert consultant programs. Meth. Inform. Med. 23:9.

Miller RA. 1994. Medical diagnostic decision support systems: Past, present, and future: A threaded bibliography and commentary. J. Am. Med. Info. Assoc. 1(1):8.

Minsky M. 1975. A framework for representing knowledge. In: P Winston (Ed.), The Psychology of Computer Vision. New York, McGraw-Hill.

Muller R, Thews O, Rohrbach C, et al., 1996. A graph-grammar approach to represent causal, temporal, and other contexts in an oncological patient record. Methods Inf. Med. 35:137.

Musen MA, Fagan LM, Combs DM, Shortliffe EH. 1986. Facilitating knowledge entry for an oncology therapy advisor using a model of the application area. Medinfo 86:46.

Neapolitan RE. 1993. Computing the confidence in a medical decision obtained from an influence diagram. Artif. Intell. Med. 5(4):341.

Newell A. 1982. The knowledge level. Artif. Intell. 18:87.

Nii HP, Aiello N. 1979. AGE: A knowledge-based program for building knowledge-based programs. In: Proc. 6th Int. Joint Conf. Artif. Intell., pp. 645-655. Tokyo, Japan.

Nohr C. 1994. The evaluation of expert diagnostic systems: How to assess outcomes and quality parameters? Artif. Intell. Med. 6(2):123.

Olesen KG, Andreassen S. 1993. Specification of models in large expert systems based on causal probabilistic networks. Artif. Intell. Med. 5(3):269.

Park W, Hoffman EA, Sonka M. Segmentation of intrathoracic airway trees: A fuzzy logic approach. IEEE Trans. Med. Imag. 17(4):489.

Patel VL, Kaufman DR, and Arocha JF. 1995. Steering through the murky waters of a scientific conflict. Situated and symbolic models of clinical cognition. Artif. Intell. Med. 7(5):413.

Patil RS, Szolovits P, Schwartz WB. 1981. Causal understanding of patient illness in medical diagnosis. In: Proc. 7th Int. Joint Conf. Artif. Intell., pp. 893-899.

Patil R. 1983. Role of causal relations in formulation and evaluation of composite hypotheses. IEEE Med. Comp.

Patil R. 1986. Review of causal reasoning in medical diagnosis. In: Proc. 10th SCAMC, pp. 11-16.

Pauker SG, Gorry GA, Kassirer JP, Schwartz WB. 1976. Towards the simulation of clinical cognition: Taking a present illness by computer. Am. J. Med. 60:981.

Pearl J. 1987. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA, Morgan Kaufmann.

Pedrycz W, Bortolan G, Degani R. 1991. Classification of electrocardiographic signals: A fuzzy pattern matching approach. Artif. Intell. Med. 3(4):211.

Peng Y, Reggia JA. 1987. A probabilistic causal model for diagnostic problem solving: Integrating symbolic causal inference with numeric probabilistic inference. IEEE Trans. Syst. Man. Cybernet. 17:146.

Politakis P, Weiss SM. 1980. A system for empirical experimentation with expert knowledge. In: Proc. 15th HICCS, pp. 675-683.

Politakis P, Weiss SM. 1984. Using empirical analysis to refine expert system knowledge bases. Artif. Intell. 22:23.

Poon AD, Fagan LM, Shortliffe EH. 1996. The PEN-Ivory project. Exploring user-interface design for selection of items from large controlled vocabularies of medicine. J. Am. Med. Inform. Assoc. 3:168.

Pople HE. 1973. On the mechanization of abductive logic. In: Proc. Int. Joint Conf. Artif. Intell., pp. 147-182.

Pople H, Myers J, Miller R. 1975. DIALOG: A model of diagnostic logic for internal medicine. In: Proc. 4th Int. Joint Conf. Artif. Intell., pp. 848-855.

Pople H. 1977. The formation of composite hypotheses in diagnostic problem solving. In: Proc. 5th Int. Joint Conf. Artif. Intell., pp. 1030-1037.

Pople H. 1982. Heuristic methods for imposing structure on ill-structured problems: The structuring of medical diagnoses. In: P Szolovits (Ed.), Artificial Intelligence in Medicine. Boulder, CO, Westview Press.

Potthoff P, Rothemund M, Schwefel D, et al., 1988. Expert Systems in Medicine. Cambridge, England, Cambridge University Press.

Ramoni M, Stefanelli M, Magnani L, Barosi G. 1992. An epistemological framework for medical knowledge-based systems. IEEE Trans. Syst. Man. Cybernet. 22(6):1361.

Reggia JA, Pula TP, Price TR, Perricone BT. 1980. Towards an intelligent textbook of neurology. In: Proc. 4th Ann. Symp. Comp. Appl. Med. Care, pp. 190-199. Washington.

Reggia JA, Nau DS, Wang PY. 1983. Diagnostic expert systems based on a set covering model. Int. J. Man-Machine Stud. 19:437.

Rosse C, Mejino JL, Modayur BR, et al., 1998. Motivation and organizational principles for anatomical knowledge representation: The Digital Anatomist symbolic knowledge base. J. Amer. Med. Inf. Assoc. 5(1):17.

Rumelhart DE, McClelland JL. 1986. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, MA, MIT Press.

Rutledge GW, Thomsen GE, Farr BR, et al., 1993. The design and implementation of a ventilator management advisor. Artif. Intell. Med. 5(1):67.

Schecke TH, Rau G, Popp HJ, et al., 1991. A knowledge-based approach to intelligent alarms in anesthesia. IEEE Eng. Med. Biol. 10:38.

Schwartz WB. 1970. Medicine and the computer: The promise and problems of change. N. Engl. J. Med. 283:1257.

Shachter RD. 1986. DAVID: Influence diagram processing system for the Macintosh. In: Proc. Workshop on Uncertainty in Artificial Intelligence, AAAI, Philadelphia, PA, pp. 311-318.

Shahar Y, and Musen MA. 1993. RESUME. A temporal-abstraction system for patient monitoring. Comp. Biomed. Res. 26:255.

Shapiro AR. 1977. The evaluation of clinical predictions: A method and initial application. N. Engl. J. Med. 296:1509.

Shortliffe EH, Axline SG, Buchanan BG, Cohen SN. 1974. An artificial intelligence program to advise physicians regarding antimicrobial therapy. Comput. Biomed. Res. 6:544.

Shortliffe EH. 1976. Computer-Based Medical Consultation: MYCIN. New York, Elsevier.

Shortliffe EH, Scott AC, Bischoff MB, et al., 1981. An expert system for oncology protocol management. In: Proc. 7th Int. Joint Conf. Artif. Intell., pp. 876-881.

Shortliffe EH. 1993. The adolescence of AI in medicine: will the field come of age in the 1990s? Artif. Intell. Med. 5:93.

Siregar P, and Sinteff J-P. 1996. Introducing spatio-temporal reasoning into the inverse problem in electroencephalography. Artif. Intell. Med. 8(2):97.

Sittig DF, Orr JA. 1992. A parallel implementation of the backward error propagation neural network training algorithm: Experiments in event identification. Comput. Biomed. Res. 25(6):547.

Smith JW, Svirbely J, Evans C, et al., 1985. RED: A red-cell antibody identification expert module. J. Med. Syst. 9(3):121.

Smith JW, Johnson TR. 1993. A stratified approach to specifying, designing, and building knowledge systems. IEEE Expert 8(3).

Smith JW, Bayazitoglu A, Johnson TR, et al., 1995. One framework, two systems. Flexible abductive methods in the problem-space paradigm applied to antibody identification and biopsy interpretation. Artif. Intell. Med. 7(3):201.

Sonnenberg FA, Hagerty CG, Kulikowski CA. 1994. An architecture for knowledge based construction of decision models. Med. Decis. Mak. 13(30):27.

Speedie SM, Palumbo FB, Knapp DA, Beardsley R. 1981. Rule-based drug prescribing review: An operational system. In: Proc. 5th Ann. Symp. Comp. Appl. Med. Care, pp. 598-602. Washington.

Stead WW. 1997. The evolution of the IAIMS: Lessons from the next decade. J. Amer. Med. Inform. Assoc. 4(2):S4.

Stevens RH, Lopo AC, Wang P. 1996. Artificial neural networks can distinguish novice and expert strategies during complex problem solving. J. Am. Med. Inf. Assoc. 3(2): 131.

Summers R, Carson ER, Cramp DG. 1993. Ventilator management. IEEE Eng. Med. Biol. 12:50.

Swartout WR. 1981. Explaining and justifying expert consulting programs. In: Proc. 7th Int. Joint Conf. Artif. Intell., pp. 815-822.

Swartout WR, Moore JD. 1993. Explanation in Second Generation Expert Systems. In: J-M David et al., (Eds.), Second Generation Expert Systems, pp. 543-615. Berlin, Springer-Verlag.

Szolovits P, Pauker SG. 1978. Categorical and probabilistic reasoning in medical diagnosis. Artif. Intell. 11:115.

Taylor P, Fox J, and Todd-Pokropek A. 1997. A model for integrating image processing into decision aids for diagnostic radiology. Artif. Intell. Med. 9(3):205.

Tu SW, Eriksson H, Gennari JH et al., 1995. Ontology-based configuration of problem-solving methods and generation of knowledge-acquisition tools. Application of PROTEGE II to protocol-based decision support. Artif. Intell. Med. 7(3): 257.

Uckun S, Dawant BM, Lindstrom DP. 1993. Model-based diagnosis in intensive care monitoring: The YAQ approach. Artif. Intell. Med. 5(1):31.

Uckun S. 1993. Artificial intelligence in medicine: state of the art and future prospects. Artif. Intell. Med. 5:89.

van Beek P. 1991. Temporal query processing with indefinite information. Artif. Intell. Med. 3(6):325.

van Melle W. 1979. A domain independent production-rule system for consultation programs. In: Proc. 6th Int. Joint Conf. Artif. Intell., pp. 923-925. Tokyo, Japan.

Wagner MM, Pankaskie M, Hogan W, et al., 1997. Clinical event monitoring at the University of Pittsburgh. Proc. AMIA Fall Symp. 188.

Weiss SM. 1974. A system for model-based computer-aided diagnosis and therapy. Thesis, Rutgers University.

Weiss S, Kulikowski C, Amarel S, Safir A. 1978. A model-based method for computer-aided medical decision-making. Artif. Intell. 11:145.

Weiss S, Kulikowski C. 1979. EXPERT: A system for developing consultation models. In: Proc. 6th Int. Joint Conf. Artif. Intell., pp. 942-950. Tokyo, Japan.

Weiss SM, Kulikowski CA, Galen RS. 1983. Representing expertise in a computer program: The serum protein diagnostic program. J. Clin. Lab. Automation 3:383.

Weiss SM, Kulikowski CA. 1984. A Practical Guide to Designing Expert Systems. Totowa, Rowman and Allenheld.

Weiss SM, Kulikowski CA. 1991. Computer Systems That Learn. San Mateo, Calif., Morgan Kaufmann.

Widman LE. 1992. A model-based approach to the diagnosis of the cardiac arrhythmias. Artif. Intell. Med. 4(1):1.

Widmer G, Horn W, Nagele B. 1993. Automatic knowledge base refinement: Learning from examples and deep knowledge in rheumatology. Artif. Intell. Med. 5(3):225.

Wielinga BJ, Schreiber AT, Breuker JA. 1992. KADS: A modelling approach to knowledge engineering. Knowledge Acquisition 4:5.

Willems JL, Abreu-Lima C, VanBemmel AP, et al., 1991. The diagnostic performance of computer programs for the interpretation of electrocardiograms. N. Engl. J. Med. 325:1767.

Yu VL, Buchanan BG, Shortliffe EH, et al., 1979. An evaluation of the performance of a computer-based consultant. Comput. Progr. Biomed. 9:95.

Ner, D. A., Micheli-Tzanakou, E. “Artificial Neural Networks: Definitions, Methods, Applications.” Biomedical Engineering Handbook: Second Edition.

Joseph D. Bronzino

Boca Raton: CRC Press LLC, 2000

Design Issues in Developing Clinical Decision Support and Monitoring Systems

John W. Goethe

The Institute of Living

Joseph D. Bronzino

Trinity College/Biomedical Engineering Alliance for Connecticut (BEACON)

Design Recommendations

Description of a Clinical Monitoring System

Outcome Assessment


As discussed in previous chapters, health care facilities presently use computers to support a wide variety of administrative, laboratory, and pharmacy activities. However, few institutions have installed computer systems that provide ongoing monitoring of care and assist in clinical decision making. Despite some promising examples of the use of artificial intelligence to enhance patient care, very few products are routinely used in clinical settings [Morelli et al., 1987; Shortliffe & Duda, 1983]. One of the important factors that has limited the acceptance of decision support tools for clinicians is the lack of medical staff input in system development. Thus, a major task for designers of computer systems that are to be incorporated into the daily clinical routine is to involve practitioners in the design process. This chapter presents a set of design recommendations, the goal of which is to ensure end-user acceptance, and describes a comprehensive clinical system that was designed using this approach and is now in routine use in a large psychiatric hospital.

Design Recommendations

The design phase of any initiative requires attention to the social and organizational context in which the new product or program will exist. Although a full discussion of the issue is beyond the scope of this chapter, it has been extensively covered in a number of texts. (See especially Peters and Tseng [1983] for a theoretical overview and illustrative case studies.) In addition to these general principles of organizational change, several specific design steps are critical to end-user acceptance.

Establish a Project Team. The composition of the project team should reflect the scope of the activities within the facility and the organizational structure (hierarchy). It should include at least one outside consultant and at least one member of the medical/professional staff who will serve as the liaison to a clinician task force representing all practitioners.

Establish a Clinician Task Force. In addition to the project team, which represents the entire institution, it is important to form a clinician task force of five to seven members of the medical/professional staff. In some facilities all care is provided by physicians, while in other settings psychologists, social workers, nurses, and other professionals function as primary clinicians. The task force must represent all these professionals.

Meetings of the task force must begin well before the hardware and software decisions have been made so that there will be adequate time to incorporate suggestions from the clinical staff. The implementation schedule must allow time for resolution of the issues raised by the end-users. (Resolution here does not imply that all parties will be pleased with all decisions, but the design team should ensure adequate time for discussion of each issue raised by the clinical staff and for reaching closure.)

Know the Limits within Which the Project Team Must Work. Part of the design task is to determine the limitations present and the level of expectation for system use. For example, is the number of terminals and their location sufficient for all clinicians to have easy access? Does the administration expect clinicians to type rather than dictate patient summaries? If an administrator is not on the project team, interviews with the key executives should be held as soon as possible, with follow-up meetings regularly scheduled.

Identify All Institutional Initiatives and Needs. It is important for system designers to be aware of all institutional plans. For example, the facility may have recently modified its clinical documentation procedures in response to a change in reimbursement regulations or as part of a new managed care contract. Many of these changes will have an impact on the system design, and some may create opportunities for cost or time savings through automation.

Identify the Types of Patient Care Activities Typical for the Institution. Using the language/action approach of Winograd [1987], one can classify all clinical care activities within an institution. As discussed by Morelli and coworkers [1993], all actions in a psychiatric hospital can be grouped into six categories: assessment notation, medication order, nonmedication order, notification of a critical event, request by the clinician for additional information (consultation), and request of the clinician for additional information (e.g., to justify a treatment or hospital admission). This list, while not necessarily applicable to all clinical settings, is useful as a template. (See Lyytinen [1987] for a review of other approaches to understanding the component activities within an organization.) Once all possible clinical actions are categorized, list those that the system is intended to support. Such detailed assessment of the structure and nature of clinician actions allows designers to determine prior to implementation how the system will affect existing practices. For example, will clinicians be expected to use the computer to enter orders, eliminating the need for nurse transcription of handwritten orders? Will clinicians have to respond to computer notices and document why a treatment standard was not followed?
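The categorization step above can be sketched in a few lines of Python. This is only an illustrative sketch: the six category names come from the text, but the `ClinicalAction` enum, the `SUPPORTED` set, and the `scope_report` helper are hypothetical names introduced here, and the choice of supported actions is an arbitrary example rather than a description of any real system.

```python
from enum import Enum


class ClinicalAction(Enum):
    """The six action categories listed in the text (Morelli and coworkers [1993])."""
    ASSESSMENT_NOTATION = "assessment notation"
    MEDICATION_ORDER = "medication order"
    NONMEDICATION_ORDER = "nonmedication order"
    CRITICAL_EVENT_NOTIFICATION = "notification of a critical event"
    REQUEST_BY_CLINICIAN = "request by the clinician for additional information"
    REQUEST_OF_CLINICIAN = "request of the clinician for additional information"


# Hypothetical scoping decision: which categories the planned system will support.
SUPPORTED = {
    ClinicalAction.MEDICATION_ORDER,
    ClinicalAction.CRITICAL_EVENT_NOTIFICATION,
}


def scope_report(supported):
    """Map every category to True/False so unsupported actions are visible too."""
    return {action.value: action in supported for action in ClinicalAction}


if __name__ == "__main__":
    for name, is_supported in scope_report(SUPPORTED).items():
        print(f"{name}: {'supported' if is_supported else 'manual'}")
```

Listing every category, including those left manual, makes it easier to see before implementation which existing practices the system will change.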

Illustrate How the New System Will Be Integrated into the Existing Environment. Computerization of one aspect of clinical care may have an unexpected impact on other areas of practice. Thus, a detailed map of the interactions involved in formulating and executing medical orders is a useful step [Morelli et al., 1993]. Even desirable changes in the clinical routine, such as reducing handwritten progress notes, must be examined since the existing nonautomated methods may have served some additional (e.g., educational) purpose not evident without careful mapping of the interaction. Both automated and manual procedures should appear logical and relevant to the staff. Screen formats and input/retrieval features, for example, must follow practices acceptable to the facility. Interactions with the computer must be viewed as nonintrusive and must not require extensive data entry by clinicians.

Clinician Interaction with the Terminal Should Be Part of Daily Clinical Activity. A computer system should not be a stand-alone tool; rather, it should be fully integrated into the clinical routine. For example, clinicians should have direct and easy access to on-line displays of current medications and results of laboratory tests. Furthermore, clinicians should be able to query the system for additional information when laboratory or other data indicate a potential problem.

The System Should Provide Real-time Feedback to Clinicians. The timeliness of an information system is a critical factor in health care applications. Although not all functions have to operate literally in real time, they do have to operate in a manner consistent with clinician practice. For example, the system should notify clinicians immediately of events such as drug-drug interactions, but in other situations (e.g., a reminder that a follow-up test is due) a notification within 2 days may be sufficient. Laboratory results and treatment data must reflect the orders entered as of the day the system is being queried.
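One way to make the timeliness requirement explicit is to attach a maximum notification delay to each event type. The sketch below assumes this design; `NotificationPolicy`, `POLICIES`, `notify_by`, and the event-type keys are hypothetical names, and the two delays simply restate the examples given in the text (immediate for drug-drug interactions, 2 days for a follow-up test reminder).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class NotificationPolicy:
    event_type: str
    max_delay: timedelta  # longest acceptable wait before the clinician is told


# Delays restate the two examples in the text; the keys are illustrative.
POLICIES = {
    "drug_drug_interaction": NotificationPolicy("drug_drug_interaction", timedelta(0)),
    "follow_up_test_due": NotificationPolicy("follow_up_test_due", timedelta(days=2)),
}


def notify_by(event_type, detected_at):
    """Return the latest time at which the notification should be delivered."""
    return detected_at + POLICIES[event_type].max_delay


if __name__ == "__main__":
    detected = datetime(1994, 7, 11, 0, 50)
    print(notify_by("drug_drug_interaction", detected))
    print(notify_by("follow_up_test_due", detected))
```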

Conduct Site Visits to Assess Computer Use in the Delivery of Clinical Service. Important aspects of system design may be overlooked if unique characteristics of a given site are not taken into account. For example, the system designers may plan to install a computer terminal in the area from which medications are dispensed in each nursing station. If on one unit, because of size or location, this area cannot accommodate a standard terminal, an alternative device may have to be purchased or architectural changes made.

Include Support for Key Administrative as well as Clinical Care Tasks. Administrative activities directly related to clinician practice are quality assurance, peer review, continuing medical education (including information dissemination and needs assessment), drug utilization evaluations, patient outcomes assessment, and continuous quality improvement programs. Since both administrative and clinical tasks are intended to serve a common purpose, the delivery of the best available services to the patient, they should be integrated whenever possible. A detailed listing of all such tasks, specifying the structure of each and the individuals involved, may identify important areas in which the clinical information system can be usefully applied.

Evaluate the System Prior to Implementation. Once the design is complete, an evaluation step is necessary to test the utility of the system in an actual clinical practice.

The System Must Be Adaptable to Individual Clinician and Program Needs. Since rapid changes are commonplace in health care, it must be possible to modify and update the system without extensive additional programming.

Description of a Clinical Monitoring System

The Institute of Living’s Clinical Evaluation and Monitoring System (CEMS) was developed to provide comprehensive support for clinical services in a psychiatric hospital. It represents an extension of earlier work that resulted in a prototype psychopharmacology monitoring system described elsewhere [Bronzino et al., 1989]. The current system provides decision support and/or automated monitoring for each key component of care (assessment, diagnosis, and treatment) and consists of four modules: treatment standards (“pharmacotherapy guideline”), diagnostic checklists (DCLs), information alerts, and outcome assessment. Table 180.1 shows the components of care that are supported by each of the four modules of the CEMS.

The treatment standards are accessed from the menu available to all clinical staff. This document summarizes key information about the use of selected psychiatric medications (dosages, therapeutic serum levels, indications, and side effects) and presents decision trees to guide drug selection. The manual is continuously updated and serves as both a reference and a vehicle for dissemination of new information.

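TABLE 180.1 Clinical Evaluation and Monitoring System (CEMS)

Components of Care (columns): Assessment (Pt. History/Symptoms); the remaining column headings and the cell entries are not recoverable from the source.

Modules (rows): Treatment Standards; DCLs*; Information alerts; Outcome assessment

*DCLs = diagnostic checklists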

The treatment standards manual is not linked electronically to the other modules, but its content is the database for the decision rules in the information alerts (described below).

The diagnostic checklists (DCLs) provide an automated method for assuring documentation of the key symptoms and behavioral issues that support the assigned diagnosis and for noting at subsequent evaluation points (e.g., at discharge) the degree of change in each symptom/behavior. In contrast to diagnostic assessments in other specialties, there are few procedures or laboratory tests that definitively establish a psychiatric diagnosis. The Diagnostic and Statistical Manual (DSM) of the American Psychiatric Association [1987] and the International Classification of Diseases (ICD) published by the World Health Organization [1978] provide standard criteria on which to base diagnoses. In both DSM and ICD there is a specific algorithm that includes the type, number, and intensity of symptoms required for each diagnosis. The algorithms for all DSM diagnoses, modified to provide additional data, are used in the CEMS, and there is a separate 1-page checklist for each disorder. The clinician’s admission diagnosis determines which form is presented. The form can be completed by the clinician at the terminal, or a paper version can be used with subsequent input by clerical staff. The clinician indicates whether each symptom/behavior is present or absent, thereby ensuring complete documentation of the findings relevant to that diagnosis. If the symptoms specified fail to meet DSM criteria for the disorder selected, a message is generated along with an explanation. The DCL is also completed at the time of discharge, at which time the system generates a new version of the form that contains only those items that the clinician indicated were present on admission. (For implementation in settings with longer periods of treatment, the DCLs could also be used for serial assessments prior to discharge.)
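The checklist logic described above can be illustrated with a minimal Python sketch. It is not the CEMS implementation: the `Checklist` class, the `evaluate` and `discharge_form` functions, the item names, and the threshold are all hypothetical, and the single "minimum number of items present" rule is a deliberate simplification of the DSM algorithms, which also consider the type and intensity of symptoms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checklist:
    diagnosis: str
    items: tuple       # symptom/behavior items shown on the form
    min_present: int   # placeholder threshold, NOT an actual DSM criterion


def evaluate(checklist, present):
    """Return (criteria_met, explanation) for the items marked present."""
    unknown = set(present) - set(checklist.items)
    if unknown:
        raise ValueError(f"items not on this checklist: {sorted(unknown)}")
    count = len(set(present))
    if count >= checklist.min_present:
        return True, "criteria met"
    return False, f"{count} of the required {checklist.min_present} items present"


def discharge_form(checklist, present_on_admission):
    """Discharge version of the form: only items marked present on admission."""
    return [item for item in checklist.items if item in present_on_admission]


if __name__ == "__main__":
    demo = Checklist("example disorder", ("item A", "item B", "item C", "item D"), 3)
    print(evaluate(demo, {"item A", "item C"}))
    print(discharge_form(demo, {"item C", "item A"}))
```

Returning an explanation alongside the result mirrors the behavior described in the text, where a message is generated together with the reason the criteria were not met.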

The information alerts are computer-generated notifications that assist in ensuring compliance with practice guidelines and medication protocols. Clinicians may respond to these notices by changing the treatment plan or diagnosis or by following up on missing or abnormal laboratory orders as directed by the alert (i.e., the clinician is then in compliance with the treatment standard). The system also allows clinicians to document the reason for deviating from the standard. This step is initiated via a function key (see Screen 1); an item is then selected from a list of reasons for “nonstandard” practice or a free-text explanation is entered (see Screen 2). The alert statement with the attached explanation is printed on the next treatment plan review, a medical records form summarizing treatments and patient progress that the clinician must sign. Thus, the information alerts provide ongoing monitoring and critical event notification and ensure documentation of the rationale for any nonstandard practice. (All notifications and the clinician responses are reviewed by the medical director or designee daily.) The direct feedback provided by the alerts also serves an educational and information dissemination function. New information about medications, for example, can be incorporated into one or more alerts as well as added to the treatment guidelines described above.

Alert Details as of 7/12/94

Alert Nbr: Description:

Issued on: 07/11/94 00:50 Last Notified on: 07/12/94 00:41 Notification Nbr: 2

Patient Name: Doe, John Q. MRNR: Account:

Patient DOB: Sex:

Screen 1

DIAL710A Alert Details as of 7/12/94

Patient: Doe, John Q. MRNR: 123456 Account: 123456789

Suspension Reasons

01 Medical work-up in progress to rule out organic cause of depression

Unable to tolerate antidepressants

Refusing antidepressants

Washout period

Drug-free trial

Patient’s affective episode characterized by prominent thought and/or perceptual disturbance, and is receiving a trial of antipsychotic medication

Patient is receiving an antidepressant from an outside pharmacy

Type:

Screen 2

The system currently has 130 alerts and can be expanded to search as many as 2000 records for up to 200 critical events each day. Additional alerts can be added without additional programming, and changes in the specifications for an existing alert (e.g., dosage parameters) can be made by accessing the appropriate table from the maintenance screen. Alerts can include data elements on any medication prescribed, any laboratory value, any psychiatric diagnosis, or a variety of historical/demographic items (e.g., gender, age). The limitation in the last of these four categories is the amount of information from the medical record that is currently on-line. With a completely electronic medical chart, this system could generate an alert for almost any clinical event.
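The idea that alerts live in a maintenance table, so that new alerts need no new programming, can be sketched as a small table-driven rule engine. This is an assumption-laden illustration, not the CEMS design: the row layout, the `ALERT_TABLE`, `OPS`, and `run_alerts` names, the field names, and every threshold value are hypothetical placeholders and are not clinical recommendations.

```python
import operator

# Comparison operators a table row may name.
OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}

# Each row is one alert definition. All values are placeholders for
# illustration only; they are NOT clinical thresholds.
ALERT_TABLE = [
    {"alert_nbr": 1, "field": "lithium_serum_level", "op": ">", "value": 1.0,
     "message": "Serum level above configured limit"},
    {"alert_nbr": 2, "field": "age", "op": "<", "value": 18,
     "message": "Patient under configured age limit"},
]


def run_alerts(record, table=ALERT_TABLE):
    """Return (alert_nbr, message) for every table row the record triggers."""
    fired = []
    for rule in table:
        value = record.get(rule["field"])
        if value is None:
            continue  # data element not on-line for this patient
        if OPS[rule["op"]](value, rule["value"]):
            fired.append((rule["alert_nbr"], rule["message"]))
    return fired


if __name__ == "__main__":
    patient = {"lithium_serum_level": 1.4, "age": 40}
    print(run_alerts(patient))
```

Because the rules are data, adding an alert or changing a dosage parameter amounts to editing a table row; the `run_alerts` function itself does not change.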

Outcome Assessment

At admission, discharge, and regular intervals postdischarge, nursing staff complete a functional assessment survey on all patients. These data are entered into the system using an optical scanner. Regular feedback is provided to hospital staff and used to evaluate clinical services and practice.

In a typical scenario a clinician, after logging on, would select “Show My Alerts” from the menu (Screen 3). Alerts for all of the patients assigned to that clinician are displayed as shown on Screen 4. (The medical director or other supervising clinician can select all active alerts by type, by practitioner, or by patient.) Patient name and alert message are given, but no further detail can be obtained nor can a direct response to the alert be made without selecting a specific patient and a specific alert for that patient. When the clinician goes to a specific alert, there is the option (Screen 1) to suspend the alert as described above or to query the system about the patient’s diagnosis, current medication, laboratory values, or the “history” of that alert (i.e., if the alert has previously been issued, a log of all activity on that alert is given). From the current medication and laboratory screens there is an additional option that allows access to all historical lab values/medications. This historical feature is especially valuable for tracking patients who require frequent drug serum level determinations (e.g., with lithium carbonate) or who have medical disorders that necessitate periodic laboratory tests (e.g., chronic kidney or liver disease).

Screen 3 (menu screen; contents not recoverable from the source)

Alert Details as of 7/12/94

Doe, John
02: 06/24/94
Doe, Mary
03: 07/12/94
Smith, John
04: 07/05/94
Jones, J. P.
Brown, Jane

F2: MEDS

Screen 4
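The alert "history" and suspension behavior described above can be sketched as a small record type that logs every activity. This is a hypothetical illustration only: the `Alert` class, its fields, and the `notify` and `suspend` methods are names introduced here, and the real system's data model is not described in the source.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Alert:
    alert_nbr: int
    patient: str
    message: str
    suspended: bool = False
    history: list = field(default_factory=list)  # (timestamp, activity, note)

    def notify(self, when):
        """Log that the clinician was notified of this alert."""
        self.history.append((when, "notified", ""))

    def suspend(self, when, reason):
        """Suspend the alert; a listed reason or free-text explanation is required."""
        if not reason.strip():
            raise ValueError("a reason is required to suspend an alert")
        self.suspended = True
        self.history.append((when, "suspended", reason))


if __name__ == "__main__":
    alert = Alert(2, "Doe, John Q.", "example alert message")
    alert.notify(datetime(1994, 7, 11, 0, 50))
    alert.notify(datetime(1994, 7, 12, 0, 41))
    alert.suspend(datetime(1994, 7, 12, 9, 0), "Washout period")
    for entry in alert.history:
        print(entry)
```

Keeping the reason in the same log as the notifications means a later review of the alert sees both what was issued and why the standard was not followed.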


Computers are increasingly user friendly, but the clinical environment may often not be computer friendly. In addition to the obvious computer engineering tasks, there is a sizable human engineering issue that must be addressed if automated tools are to be accepted and fully utilized in clinical settings. Thus, designers must take into account the social and organizational structure of the health care delivery network the system is intended to serve, involve clinician staff in its development, and adhere to a design strategy that will ensure end-user acceptance. Although the system outlined in this chapter was developed for psychiatric practice, the components of care described are common to all branches of medicine, and the design steps recommended are applicable for a wide range of clinical settings.


Acknowledgments

Many elements of the system now in place at the Institute of Living represent extensions of the earlier efforts of a number of individuals. Peter Ericson, former head of the Department of Information Services at the Institute, and Bernard C. Glueck, Jr., M.D., former director of research, were among the pioneers in applying information systems to clinical care. Major contributors to one or more components described in this chapter include Pawel Zmarlicki, David Cole, David Warchol, Russell Dzialo-Evans, and Bonnie Szarek, R.N.

Defining Terms

Decision trees: Decision trees or flow charts are a common way to present medical information that is intended to support the decision-making process. Many decisions are hierarchical in nature, but there may be more than one acceptable option at each nodal point on the decision tree.

Language/action model: The language/action model of human-computer communication is a term taken from the work of Winograd, who introduced the term “conversation for action” to describe how human behavior is coordinated within an organization. To quote Winograd, “We work together by making commitments so that we can successfully anticipate the actions of others and coordinate them with our own.” According to this model, computer systems are part of the conversational structure of the organization and, along with the human employees, engage in a variety of well-defined interactions (communications). Winograd’s approach is based on earlier work by J. L. Austin and J. R. Searle.

Map: A map of the interactions is a detailed assessment, often diagrammatically expressed, of how the work of the organization is accomplished. In a medical facility, the steps involved include the physical assessment of the patient by the clinician, various procedures performed on the patient, and the recording, storage, retrieval, and analysis of data. Such maps can be informed by but do not necessarily have to follow Winograd’s language/action approach (described above) or any other formal model of analysis.


References

American Psychiatric Association. 1987. Diagnostic and Statistical Manual of Mental Disorders, 3d ed rev. Washington, DC, American Psychiatric Association.

Bronzino JD, Morelli RA, Goethe JW. 1989. Overseer: A prototype expert system for monitoring drug treatment in the psychiatric clinic. IEEE Trans Biomed Eng 36:533.

Lyytinen K. 1987. Different perspectives on information systems: Problems and solutions. ACM Computer Survey 19:5.

Morelli RA, Bronzino JD, Goethe JW. 1987. Expert Systems in psychiatry: A review. Proceedings 20th Hawaii International Conference System Science, 3:84.

Morelli RA, Bronzino JD, Goethe JW. 1993. Conversations for action: A speech act model of human-computer communications in a psychiatric hospital. J Intelli Sys 3:87.

Peters JP, Tseng S. 1983. Managing Strategic Change in Hospitals, Chicago, American Hospital Association.

Shortliffe EH, Duda R. 1983. Expert systems research. Science 220:261.

Winograd T. 1987. A language/action perspective on the design of cooperative work. Report No. STAN-CS-87-1158, Stanford University.

World Health Organization. 1978. Mental Disorders: Glossary and Guide to Their Classification in Accordance with the Ninth Revision of the International Classification of Disease, Geneva, World Health Organization.

Finkelstein, S. M. “Artificial Intelligence.”

The Biomedical Engineering Handbook: Second Edition.

Ed. Joseph D. Bronzino

Boca Raton: CRC Press LLC, 2000


Artificial Intelligence

Stanley M. Finkelstein

University of Minnesota

Artificial Intelligence in Medical Decision Making: History, Evolution, and Prospects

Casimir A. Kulikowski

Early Models of Medical Decision Making • Emergence of the Knowledge-Based AI Methods for Medical Consultation • The Transition to Expert Systems and the Ascendancy of Rule-Based Systems—1976-1982 • Exploration of Alternative Representations for Medical AI and the Search for Performance—1983-1987 • The Past Decade—Structure, Formalism, and Empiricism • Prospects for AI in Medicine—Problems and Challenges

Artificial Neural Networks: Definitions, Methods, Applications

Daniel A. Zahner, Evangelia Micheli-Tzanakou

Definitions • Training Algorithms • VLSI Applications of Neural Networks • Applications in Biomedical Engineering

Clinical Decision Systems

Pirkko Nykanen, Niilo Saranummi

Clinical Decision Making • Clinical Decision Systems History • Clinical Application Areas and Types of Systems • Requirements and Critical Issues for Clinical Decision Systems Development • Evaluation of Clinical Decision Systems • Summary

Expert Systems: Methods and Tools

Ron Summers, Ewart R. Carson, Derek Cramp

Expert System Process Model • Knowledge Acquisition • Knowledge Representation • Other Methods and Tools • Summary

Knowledge Acquisition and Representation

Catherine Garbay

Medical Expertise: Domain and Control Knowledge • Knowledge Acquisition • Knowledge Representation • Conclusion

Knowledge-Based Systems for Intelligent Patient Monitoring and Management in Critical Care Environments

Benoit M. Dawant, Patrick R. Norris

Intelligent Patient Monitoring and Management • Moving Toward Computer Architectures for Real-Time Intelligent Monitoring • Discussion and Conclusions

Medical Terminology and Diagnosis Using Knowledge Bases

Peter L. M. Kerkhof

Classification and Coding Systems • An Electronic Medical Encyclopedia at Your Fingertips: Knowledge Bases and Diagnosis • Problems Related to Medical Terminology • Solution for Discrepancies

Natural Language Processing in Biomedicine

Stephen B. Johnson

Linguistic Principles • Applications in Biomedicine • Challenges


THE FOCUS OF THIS SECTION is on artificial intelligence and its use in the development of medical decision systems with clinical application. The methodologic basis for expert systems and artificial neural networks in the medical/clinical domain will be discussed. This is one aspect of the growing field of medical or health informatics, the application of engineering, computer, and information sciences to problems in health and life sciences. Other informatics applications are addressed in Section XVII. This section contains eight original contributions that cover the history, methods, and future directions in the field. It intentionally omits specific computer hardware and software details related to currently available systems because they continue to change as technology changes and are likely to be outdated even as this Handbook goes to press. Furthermore, chapters representing complete examples of specific decision systems are not included in this section. Numerous examples can be found in the current literature, and may have been cited in the chapters of this section to illustrate particular aspects of decision system structure, knowledge acquisition and representation, the user interface, and system evaluation and testing.

In Chapter 181, Dr. Kulikowski presents the history of the development of artificial intelligence (AI) methodology for medical decision making, from the early statistical and pattern-recognition methods of the 1960s to the continuing development of knowledge-based systems of the 1990s. Dr. Kulikowski has described the evolution of these AI applications. The early applications explored various approaches to handling knowledge within the specific domains of interest, utilizing either causal networks, modular rule-based reasoning, or frame/template representation of the knowledge describing the clinical domain. These varied approaches pointed out the difficulties involved in the actual acquisition of the domain expertise needed for each application and generated research investigations into acquisition strategies ranging from literature review and case study evaluation to detailed interviews with the experts. Alternative approaches to knowledge representation, the development and assessment of decision system applications, efforts to test and evaluate decision system performance, and the problems associated with reasoning in uncertain environments have become important areas of investigation. Currently, there is active interest in qualitative reasoning representation, the importance of the temporal framework of the decision process, and the effort to move toward more practical systems that embody decision support for diagnostic or treatment protocols rather than the fully automated decision system. Dr. Kulikowski points out that expert clinical systems have not become the indispensable clinical tool that many had predicted in the early days of AI in medicine, describes reasons for the shortfall, and looks forward to advanced software environments and medical devices that are beginning to routinely incorporate these ideas and methods in their development.

In Chapter 182, Drs. Micheli-Tzanakou and Zahner focus on artificial neural network (ANN) methodologies and their applications in the medical and health care arena. ANNs consist of a large number of interconnected neurons or nodes that can process information in a highly parallel manner. They are specified by their processing-element characteristics, their network topology, and the training rules they use to learn how to achieve correct pattern classification from an array of multiple inputs. Drs. Micheli-Tzanakou and Zahner provide the mathematical details for the back-propagation and ALOPEX training algorithms and discuss the benefits and deficiencies of these approaches in teaching the ANN to correctly classify input patterns presented to the system. Finally, the performance of several ANN approaches in mammography and chromosome and genetic sequence classification applications is reviewed and compared.

The contribution by Nykanen and Saranummi (Chapter 183) also provides a history of AI applications in medical decision systems but focuses on the clinical reasoning process, systems development within the clinical environment, and the critical issues for system acceptance. The early introduction of expert system shells provided a strong impetus for system development and indeed resulted in many published applications. The availability of inexpensive and powerful microcomputers and dedicated workstations for knowledge engineering also has contributed to the development of medical expert systems. The authors point out that data and knowledge are often less quantitative and consistent in clinical medicine than in the physical sciences and that moving from the knowledge level to the knowledge use level posed significant problems. Appropriate, objective, and standardized evaluation, testing, and validation are often lacking as a part of the system development process, as discussed in this chapter, and this contributes to the lack of widespread acceptance of these systems. The authors conclude with a list of challenges for future clinical decision systems posed by the need for integration with new information technologies and the developing health care information infrastructure.

The chapter by Summers, Carson, and Cramp (Chapter 184) discusses specific methodologies used in expert system development and provides some general examples of rule-based and semantic network approaches to knowledge representation in medical applications. The authors use MYCIN and CASNET, two of the earliest medical expert system applications to achieve widespread recognition and serve as models for subsequent developments, to demonstrate and compare these two approaches for knowledge representation. MYCIN uses production rules to represent causal relationships between knowledge items to diagnose and treat microbial infections. CASNET uses causal or semantic networks to represent knowledge relationships in the diagnosis and treatment of glaucoma.

Dr. Garbay’s contribution (Chapter 185) continues along these directions by providing a somewhat more conceptual approach to the question of knowledge domains, acquisition, and representation. Knowledge-acquisition techniques are reviewed, and several knowledge-acquisition tools designed to assist in the process are described. Dr. Garbay also presents fundamental ideas related to rule-based, case-based, and causal reasoning. She discusses models for the uncertainty and imprecision that are central to the clinical environment. The use of temporal knowledge and its incorporation into clinical decision systems is also discussed. The ideas of shallow and deep knowledge systems are introduced from the perspective of the system’s explanation facilities. Shallow knowledge systems rely primarily on reasoning based on experience and experimental results, while deep knowledge systems are based on detailed knowledge regarding the structure and function of the underlying system, such as those employing physiologic models within the knowledge base. Dr. Garbay, as have many contributors to this section, also introduces the new challenges for clinical decision systems with regard to the rapid developments in networking, communications, and information sciences.

A new and expanding application area involving intelligent patient monitoring and management in critical care environments is discussed in Chapter 186, by Drs. Dawant and Norris. Such systems involve the context-dependent acquisition, processing, analysis, and interpretation of large amounts of possibly noisy and incomplete data. These systems can be viewed within four distinct functional levels. At the signal level, the system acquires raw data from patient monitors, which includes analog-to-digital conversion and some low-level signal processing. Data validity checks and artifact removal occur at the validation level. The transformation from numerical features that characterize the signals to a symbolic representation, such as normal or abnormal, is performed at the signal-to-symbol level. Finally, the inference level consists of the reasoning elements described in previous chapters to arrive at diagnoses, explanations, prediction, or initiation of control actions. Examples of both shallow and deep knowledge systems developed for patient monitoring are presented.
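The four functional levels can be pictured as a chain of small functions, each consuming the output of the previous one. The sketch below is a toy illustration under stated assumptions: all function names, the plausibility and normal ranges, and the symbol-to-action rules are hypothetical placeholders (not clinical values), and real systems perform far richer signal processing and reasoning at each level.

```python
from statistics import mean


def signal_level(raw):
    """Signal level: stand-in for acquisition and A/D conversion."""
    return [float(x) for x in raw]


def validation_level(samples, plausible=(20.0, 250.0)):
    """Validation level: drop artifacts outside a plausible range (placeholder limits)."""
    low, high = plausible
    return [s for s in samples if low <= s <= high]


def signal_to_symbol_level(samples, normal=(60.0, 100.0)):
    """Signal-to-symbol level: turn a numerical feature (the mean) into a symbol."""
    if not samples:
        return "unknown"
    avg = mean(samples)
    if avg < normal[0]:
        return "low"
    if avg > normal[1]:
        return "high"
    return "normal"


def inference_level(symbol):
    """Inference level: a trivial rule table mapping symbols to actions."""
    rules = {
        "normal": "no action",
        "low": "notify clinician",
        "high": "notify clinician",
        "unknown": "check sensor",
    }
    return rules[symbol]


def monitor(raw):
    """Run one batch of raw samples through all four levels."""
    return inference_level(signal_to_symbol_level(validation_level(signal_level(raw))))


if __name__ == "__main__":
    print(monitor([72, 75, 0, 70, 999]))  # the 0 and 999 readings are treated as artifacts
```

Separating the levels this way keeps artifact handling out of the reasoning step, so the inference rules only ever see validated, symbolic input.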

Systems development and widespread dissemination often have been stymied by the lack of standardization of knowledge representation, as described in all the preceding contributions. The last two chapters look at this question from the perspective of medical terminology and the status of natural language processing in biomedicine.

In discussing applications of medical knowledge bases, Dr. Kerkhof (Chapter 187) states that while technological advances in computer processors and storage media permit virtually unlimited storage and fast retrieval capability, the interpretation of natural language constitutes a major obstacle in the advancement of knowledge-based systems in medicine. In this chapter, classification and coding systems are identified, and their content is described. The richness and depth of natural language are often the focus of difficulties when attempting to develop such systems, encountering differences in such elementary concerns as definitions, spelling, usage, and precision. Dr. Kerkhof offers several approaches to handling linguistic problems associated with such systems, but all have their own inherent limitations.

In Chapter 188, Dr. Johnson describes the techniques of natural language processing as a means to bridge the gap between textual and structured data so that users can interact with the system using familiar natural language, while computer applications can effectively process the resulting data. Applications are classified according to the levels of language competence embodied in their design. Speech-recognition and speech-synthesis systems deal with basic data representation, while lexical, syntactic, and discourse systems function at increasing levels of complexity from single words to entire discourses.

While the specific applications described in this section relate to biomedical concerns, the definitions and methods overview relate to the development and utilization of expert systems and artificial neural networks for a wide variety of decision systems. Detailed implementation protocols for such systems are beyond the scope of this section but can be found in the references cited at the conclusion of each chapter.

Design Issues in Developing Clinical Decision Support and Monitoring Systems

Basic steps in the development of an expert system.

Kulikowski, C. “Artificial Intelligence in Medical Decision Making: History, Evolution, and Prospects.” The Biomedical Engineering Handbook: Second Edition. Ed. Joseph D. Bronzino. Boca Raton: CRC Press LLC, 2000

Non-AI Decision Making



Ron Summers

Loughborough University

Derek G. Cramp

City University, London

Ewart R. Carson

City University, London

179.1 Analytical Models

179.2 Decision Theoretic Models
Clinical Algorithms • Decision Trees • Influence Diagrams

179.3 Statistical Models
Database Search • Regression Analysis • Statistical Pattern Analysis • Bayesian Analysis • Dempster-Shafer Theory • Syntactic Pattern Analysis • Causal Modeling • Artificial Neural Networks

179.4 Summary

Non-AI decision making can be defined as those methods and tools used to increase information content in the context of some specific clinical situation without having cause to refer to knowledge embodied in a computer program. Theoretical advances in the 1950s added rigor to this domain when Meehl argued that many clinical decisions could be made by statistical rather than intuitive means [1]. This view was supported by Savage [2], whose theory of choice under uncertainty is still the classical and most elegant formulation of subjective Bayesian decision theory and was largely responsible for reintroducing Bayesian decision analysis to clinical medicine. Ledley and Lusted provided further evidence that medical reasoning could be made explicit and represented in decision theoretic ways [3]. Decision theory also provided the means for Nash to develop a “Logoscope,” which might be considered the first mechanical diagnostic aid [4].

An information system developed using non-AI decision-making techniques may comprise procedural or declarative knowledge. Procedural knowledge maps the decision-making process into the methods by which the clinical problems are solved or clinical decisions made. Examples of techniques that form a procedural knowledge base are those based on algorithmic analytical models, clinical algorithms, or decision trees. Information systems based on declarative knowledge comprise what can essentially be termed a database of facts about different aspects of a clinical problem; the causal relationships between these facts form a rich network from which explicit (say) cause-effect pathways can be determined. Semantic networks and causal probabilistic networks are perhaps the best examples of information systems based on declarative knowledge. There are other types of clinical decision aids based purely on statistical methods applied to patient data, for example, classification analyses based on logistic regression, relative frequencies of occurrence, pattern-matching algorithms, or neural networks.

The structure of this chapter mirrors to some extent the different methods and techniques of non-AI decision making mentioned above. It is important to distinguish between analytical models based on quantitative or qualitative mathematical representations and decision theoretic methods typified by the use of clinical algorithms, decision trees, and set theory. Most of the latter techniques add to an information base by way of procedural knowledge. It is then that advantage can be taken of the many techniques that have statistical decision theoretic principles as their underpinning.

This section begins with a discussion of simple linear regression models and pattern recognition, but then more complex statistical techniques are introduced, for example, the use of Bayesian decision analysis which leads to the introduction of causal probabilistic networks. The majority of these techniques add information by use of declarative knowledge. Particular applications are used throughout to illustrate the extent to which non-AI decision making is used in clinical practice.

Analytical Models

In the context of this chapter, the analytical models considered are qualitative and quantitative mathematical models that are used to predict future patient state based on present state and a historical representation of what has passed. Such models could be representations of system behavior that allow test signals to be used so that the response of the system to various disturbances can be studied, thus making predictions of future patient state.

For example, Leaning and co-workers [5,6] produced a 19-segment quantitative mathematical model of the blood circulation to study the short-term effects of drugs on the cardiovascular system of normal, resting patients. The model represented entities such as the compliance, flow, and volume of model segments in what was considered a closed system. In total, the quantitative mathematical model comprised 61 differential equations and 159 algebraic equations. Evaluation of the model revealed that it was fit for its purpose in the sense of heuristic validity; that is, it could be used as a tool for developing explanations for cardiovascular control, particularly in relation to the central nervous system (CNS).

Qualitative models investigate time-dependent behavior by representing patient state trajectory in the form of a set of connected nodes, the links between the nodes reflecting transitional constraints placed on the system [7]. The types of decision making supported by this type of model are assessment and therapy planning. In diagnostic assessment, the precursor nodes and the pathway to the node (decision) of interest define the causal mechanisms of the disease process. Similarly, for therapy planning, the optimal plan can be set by investigation of the utility values associated with each link in the disease-therapy relationship. These utility values refer to a cost function, where cost can be defined as the monetary cost of providing the treatment and the cost benefit to the patient in terms of efficiency, efficacy, and effectiveness of alternative treatment options. Both quantitative [8] and qualitative [9] analytical models can be realized in other ways to form the basis of rule-based systems; however, such systems are not analyzed in this chapter.

Decision Theoretic Models

Clinical Algorithms

The clinical algorithm is a procedural device that mimics clinical decision making by structuring the diagnostic or therapeutic decision processes in the form of a classification tree. The root of the tree represents some initial state, and the branches yield the different options available. For the operation of the clinical algorithm, the choice points are assumed to follow branching logic with the decision function being a yes/no (or similar) binary choice. Thus, the clinical algorithm comprises a set of questions that must be collectively exhaustive for the chosen domain, and the responses available to the clinician at each branch point must be mutually exclusive. These decision criteria pose rigid constraints on the type of medical problem that can be represented by this method, as the lack of flexibility is appropriate only for a certain set of well-defined clinical domains. Nevertheless, there is a rich literature available; examples include the use of the clinical algorithm for acid-base disorders [10] and diagnosis of mental disorders [11].
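As a minimal sketch of the branching logic described above, the following Python fragment encodes a clinical algorithm as a tree of yes/no questions and walks it for one set of findings. The `Question` class, the `run_algorithm` function, and the toy acid-base questions with their cut-off values are illustrative assumptions; they are not taken from the systems cited in [10] or [11].

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Question:
    """One binary choice point; each branch leads to another question or a label."""
    text: str
    test: Callable[[Dict[str, float]], bool]
    if_yes: Union["Question", str]
    if_no: Union["Question", str]


def run_algorithm(node, findings):
    """Follow yes/no branches from the root until a terminal label is reached."""
    path = []
    while isinstance(node, Question):
        answer = node.test(findings)
        path.append((node.text, "yes" if answer else "no"))
        node = node.if_yes if answer else node.if_no
    return node, path


# Hypothetical, highly simplified acid-base tree (cut-offs are assumptions).
ACIDOSIS = Question(
    "Is PaCO2 above 45 mmHg?",
    lambda f: f["PaCO2"] > 45,
    "respiratory acidosis pattern",
    "metabolic acidosis pattern",
)
ALKALOSIS = Question(
    "Is PaCO2 below 35 mmHg?",
    lambda f: f["PaCO2"] < 35,
    "respiratory alkalosis pattern",
    "metabolic alkalosis pattern",
)
HIGH_PH = Question(
    "Is pH above 7.45?",
    lambda f: f["pH"] > 7.45,
    ALKALOSIS,
    "pH within reference range",
)
ROOT = Question("Is pH below 7.35?", lambda f: f["pH"] < 7.35, ACIDOSIS, HIGH_PH)

label, path = run_algorithm(ROOT, {"pH": 7.25, "PaCO2": 60.0})
print(label)
for question, answer in path:
    print(f"  {question} -> {answer}")
```

Because every question has exactly two mutually exclusive answers and every branch ends in a label, each set of findings follows exactly one path, which is the rigidity the text refers to.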

Decision Trees

A more rigorous use of classification tree representations than the clinical algorithm can be found in decision tree analysis. Although decision trees and clinical algorithms are similar in appearance from a structural perspective, in decision tree analysis the likelihood and cost-benefit of each choice are also calculated in order to provide a quantitative measure for each option available. This allows the use of optimization procedures to gauge the probability of success for the correct diagnosis being made or for a beneficial outcome from therapeutic action being taken. A further difference between the clinical algorithm and decision tree analysis is that the latter has more than one type of node (branch point): at decision nodes the clinician must decide which choice (branch) is appropriate for the given clinical scenario; at chance nodes the responses available are not under the clinician's control (for example, the response may be due to patient-specific data); and outcome nodes are the terminal nodes at the “leaves” of the decision tree, which summarize the set of all possible clinical outcomes for the chosen domain.

The possible outcomes from each chance node must obey the rules of probability and sum to unity; the probability assigned to each branch reflects the frequency of that event occurring in a general patient population. It follows that these probabilities are dynamic, with accuracy increasing as more evidence becomes available. A utility value can be added to each of the outcome scenarios. These utility measures reflect a trade-off between competing concerns, for example, survivability and quality of life, and may be assigned heuristically.
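The way probabilities and utilities combine can be sketched with a small evaluation routine. The fragment below assumes the common "averaging out and folding back" procedure (expected utility at chance nodes, best option at decision nodes); the tree, its probabilities, and its utilities are invented for illustration only.

```python
def expected_utility(node):
    """Evaluate a decision tree recursively from the leaves back to the root."""
    kind = node["type"]
    if kind == "outcome":
        return node["utility"]
    if kind == "chance":
        total = sum(p for p, _ in node["branches"])
        if abs(total - 1.0) > 1e-9:
            raise ValueError("chance-node probabilities must sum to unity")
        # Average out: weight each branch by its probability.
        return sum(p * expected_utility(child) for p, child in node["branches"])
    if kind == "decision":
        # Fold back: the clinician is assumed to pick the best option.
        return max(expected_utility(child) for _, child in node["options"])
    raise ValueError(f"unknown node type: {kind}")


def best_option(decision_node):
    """Return the name of the option with the highest expected utility."""
    return max(decision_node["options"], key=lambda opt: expected_utility(opt[1]))[0]


def outcome(utility):
    return {"type": "outcome", "utility": utility}


# Hypothetical tree: utilities on a 0 (worst) to 1 (best) scale.
tree = {
    "type": "decision",
    "options": [
        ("surgery", {"type": "chance", "branches": [(0.9, outcome(0.95)), (0.1, outcome(0.0))]}),
        ("drug therapy", {"type": "chance", "branches": [(0.6, outcome(0.9)), (0.4, outcome(0.5))]}),
    ],
}

print(best_option(tree), round(expected_utility(tree), 3))
```

With these made-up numbers the surgical branch has the higher expected utility (0.855 against 0.74), so it is the option selected at the root decision node.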

When the first edition of this chapter was written in 1995, it was noted that although a rich literature describing potential applications existed [12], the number of practical applications described was limited. The situation has changed, and there has been an explosion of interest in applying decision analysis to clinical problems. Not only is decision analysis methodology well described [13-16], but numerous articles are appearing in mainstream medical journals, particularly Medical Decision Making. An important driver for this acceleration of interest has been the desire to contain the costs of medical care while maintaining clinical effectiveness and quality of care. Cost-effectiveness analysis is an extension of decision analysis and compares the outcome of decision options in terms of the monetary cost per unit of effectiveness. Thus, it can be used to set priorities for the allocation of resources and to decide between two or more treatment or intervention options. It is most useful when comparing treatments for the same clinical condition. Cost-effectiveness analysis and its implications are described very well elsewhere [17,18].
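The "monetary cost per unit of effectiveness" can be illustrated in a few lines. This is only a sketch: the option names, costs, and effectiveness figures are invented, and the unit of effectiveness (for example, life-years gained) is an assumption.

```python
def cost_per_unit_effect(cost, effect):
    """Monetary cost divided by units of effectiveness gained."""
    if effect <= 0:
        raise ValueError("effectiveness must be positive")
    return cost / effect


# Hypothetical options for the same clinical condition: (cost, effectiveness).
options = {
    "treatment A": (12000.0, 4.0),
    "treatment B": (20000.0, 5.0),
}

ratios = {name: cost_per_unit_effect(c, e) for name, (c, e) in options.items()}
ranked = sorted(ratios, key=ratios.get)  # lowest cost per unit of effect first

for name in ranked:
    print(f"{name}: {ratios[name]:.0f} per unit of effectiveness")
```

Ranking by this ratio is one simple way to set priorities; it is meaningful only when, as the text notes, the options address the same clinical condition and effectiveness is measured in the same unit.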

Influence Diagrams

In the 1960s researchers at Stanford Research Institute (SRI) proposed the use of influence diagrams as representational models when developing computer programs to solve decision problems. However, it was recognized somewhat later by decision analysts at SRI [19] that such diagrams could be used to facilitate communication with domain experts when eliciting information about complex decision problems. Influence diagrams are a powerful mode of graphic representation for decision modeling. They do not replace but complement decision trees, and it should be noted that both are different graphical representations of the same mathematical model and operations. Recently, two exciting papers have been published that make the use of influence diagrams accessible to those interested in medical decision making [20,21].

Statistical Models

Database Search

Interrogation of large clinical databases yields statistical evidence of diagnostic value and in some representations forms the basis of rule induction used to build expert systems [22]. These systems will not be discussed here. However, the most direct approach for clinical decision making is to determine the relative frequency of occurrence of an entity, or more likely a group of entities, in the database of past cases. This enables a prior probability measure to be estimated [23]. A drawback of this simple, direct approach to problem solving is the apparent paradox that the more evidence is available, the fewer matches are found in the database; this runs against the common wisdom that more evidence increases the probability of a diagnosis being found. Further, the method does not provide a weight for each item of evidence to gauge those that are more significant for patient outcome.
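A small sketch makes the relative-frequency estimate, and the shrinking number of matches as evidence is added, concrete. The six past cases, the field names, and the `relative_frequency` helper are all hypothetical.

```python
def relative_frequency(cases, evidence, diagnosis):
    """Estimate P(diagnosis | evidence) as a relative frequency among matching cases.

    Returns (estimate, number_of_matching_cases); the estimate is None when
    no past case matches all items of evidence.
    """
    matches = [c for c in cases if all(c.get(k) == v for k, v in evidence.items())]
    if not matches:
        return None, 0
    hits = sum(1 for c in matches if c["diagnosis"] == diagnosis)
    return hits / len(matches), len(matches)


# Hypothetical database of past cases.
cases = [
    {"fever": True, "cough": True, "rash": False, "diagnosis": "X"},
    {"fever": True, "cough": True, "rash": False, "diagnosis": "X"},
    {"fever": True, "cough": False, "rash": True, "diagnosis": "Y"},
    {"fever": True, "cough": True, "rash": True, "diagnosis": "Y"},
    {"fever": False, "cough": True, "rash": False, "diagnosis": "Z"},
    {"fever": True, "cough": False, "rash": False, "diagnosis": "X"},
]

for evidence in (
    {"fever": True},
    {"fever": True, "cough": True},
    {"fever": True, "cough": True, "rash": True},
):
    estimate, n = relative_frequency(cases, evidence, "X")
    print(sorted(evidence), "matches:", n, "estimate for X:", estimate)
```

Each additional finding narrows the set of matching cases (5, then 3, then 1 here), so the estimate rests on less and less data, and every finding counts equally because no weights are attached to the items of evidence.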

Regression Analysis

Logistic regression analysis is used to model the relationship between a response variable of interest and a set of explanatory variables. This is achieved by adjusting the regression coefficients, the parameters of the model, until a ‘best fit’ to the data set is achieved. This type of model improves upon the use of relative frequencies, as logistic regression explicitly represents the importance of each element of evidence through the value of its regression coefficient. An example of clinical use can be found in the domain of gastroenterology [24].
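The idea of adjusting coefficients until a best fit is reached can be sketched with plain gradient ascent on the log-likelihood. This is one simple fitting procedure chosen for illustration (statistical packages typically use other optimizers), and the ten data rows are invented: the first explanatory variable is strongly related to the response, the second only weakly.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.1, epochs=5000):
    """Fit intercept and coefficients by gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            error = yi - p
            grad[0] += error
            for j, xj in enumerate(xi, start=1):
                grad[j] += error * xj
        w = [wj + learning_rate * g / len(X) for wj, g in zip(w, grad)]
    return w


# Hypothetical data: two binary findings per patient and a binary response.
X = [[1, 0], [1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0], [0, 1], [1, 0], [0, 1]]
y = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

w = fit_logistic(X, y)
print("intercept and coefficients:", [round(v, 2) for v in w])
print("P(response | finding 1 only):", round(sigmoid(w[0] + w[1]), 2))
```

The fitted coefficient of the first finding is much larger in magnitude than that of the second, which is how the model expresses that the first item of evidence carries more weight.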

Statistical Pattern Analysis

The recognition of patterns in data can be formulated as a statistical problem of classifying the results of clinical findings into mutually exclusive but collectively exhaustive decision regions. In this way, not only can physiologic data be classified but also the pathology that they give rise to and the therapy options available to treat the disease. Titterington [25] describes an application in which patterns in a complex data set are recognized to enhance the care of patients with head injuries. Pattern recognition is also the cornerstone of computerized methods for cardiac rhythm analysis [26]. The methods used to distinguish patterns in data rely on discriminant analysis. In simple terms, this refers to a measure of separability between class populations.

In general, pattern recognition is a two-stage process, as shown in Fig. 179.1. The pattern vector, P, is an n-dimensional vector derived from the data set used. Let Qp be the pattern space, which is the set of all possible values P may assume; the pattern recognition problem is then formulated as finding a way of dividing Qp into mutually exclusive and collectively exhaustive regions. For example, in the analysis of the electrocardiogram the complete waveform may be used to perform classifications of diagnostic value. A complex decision function would probably be required in such cases. Alternatively (and if appropriate), the pattern vector can be simplified to the investigation of subfeatures within a pattern. For cardiac arrhythmia analysis, only the R-R interval of the electrocardiogram is required, which allows a much simpler decision function to be used. This simplification may be a linear or non-linear transformation process:

X = tP

where X is termed the feature vector and t is the transformation process.








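FIGURE 179.1 Pattern recognition as a two-stage process: feature extraction maps the n-dimensional pattern vector P to the m-dimensional feature vector X = tP (m < n), and the classifier maps X to the classification C = δX.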

Just as the pattern vector P belongs to a pattern space Qp, so the feature vector X belongs to a feature space QX. As the function of feature extraction is to reduce the dimensionality of the input vector to the classifier, some information is lost.

Classification of QX can be achieved using numerous statistical methods, including discriminant functions (linear and polynomial), kernel estimation, k-nearest neighbor, cluster analysis, and Bayesian analysis.
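The two stages can be sketched end to end for the R-R interval example. In this illustration the pattern vector is a list of R-peak times, feature extraction reduces it to two numbers (mean and standard deviation of the R-R intervals, a non-linear transformation chosen here as an assumption), and the classifier is a k-nearest-neighbor vote. The labelled training features and the class names are invented.

```python
import math
import statistics


def extract_features(r_peak_times):
    """Stage 1: reduce an n-dimensional pattern vector to a 2-dimensional feature vector."""
    rr = [b - a for a, b in zip(r_peak_times, r_peak_times[1:])]
    return (statistics.mean(rr), statistics.pstdev(rr))


def knn_classify(x, training, k=3):
    """Stage 2: majority vote among the k training features nearest to x."""
    nearest = sorted((math.dist(x, feat), label) for feat, label in training)[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


# Hypothetical labelled feature vectors: (mean R-R in s, SD of R-R in s).
training = [
    ((0.80, 0.02), "regular"),
    ((0.85, 0.03), "regular"),
    ((0.78, 0.01), "regular"),
    ((0.70, 0.20), "irregular"),
    ((0.90, 0.25), "irregular"),
    ((0.65, 0.18), "irregular"),
]

steady = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0]    # R-peak times in seconds
erratic = [0.0, 0.5, 1.5, 1.9, 3.0, 3.4]

for name, peaks in (("steady", steady), ("erratic", erratic)):
    print(name, "->", knn_classify(extract_features(peaks), training))
```

Six peak times are reduced to two features, so some information is discarded, as noted above; the benefit is that a very simple decision function suffices on the reduced space.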

Bayesian Analysis

Ever since their reinvestigation by Savage in 1954 [2], Bayesian methods of classification have provided one of the most popular approaches used to assist in clinical decision making. Bayesian classification is an example of a parametric method of estimating class conditional probability density functions. Clinical knowledge is represented as a set of prior probabilities of diseases to be matched with conditional probabilities of clinical findings in a patient population with each disease. The classification problem becomes one of choosing decision levels that minimize the average rate of misclassification or, when information about prior probabilities is not available, that minimize the maximum of the conditional average loss function (the so-called minimax criterion). Formally, the optimal decision rule that minimizes the average rate of misclassification is called the Bayes rule; this serves as the inference mechanism that allows the probabilities of competing diagnoses to be calculated when patient-specific clinical findings become available.
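How priors and conditional probabilities combine into probabilities for competing diagnoses can be sketched as follows. The fragment assumes that the findings are conditionally independent given the disease (a common simplification that the text does not state) and that the diseases are mutually exclusive and exhaustive; disease names, findings, and all probabilities are invented.

```python
def posterior(priors, likelihoods, findings):
    """Apply Bayes' theorem to obtain P(disease | findings) for each disease.

    priors:      {disease: P(disease)}
    likelihoods: {disease: {finding: P(finding present | disease)}}
    findings:    {finding: True if present, False if absent}
    """
    unnormalized = {}
    for disease, prior in priors.items():
        p = prior
        for finding, present in findings.items():
            p_present = likelihoods[disease][finding]
            p *= p_present if present else 1.0 - p_present
        unnormalized[disease] = p
    total = sum(unnormalized.values())
    return {disease: p / total for disease, p in unnormalized.items()}


# Hypothetical two-disease example.
priors = {"D1": 0.2, "D2": 0.8}
likelihoods = {
    "D1": {"finding_a": 0.8, "finding_b": 0.7},
    "D2": {"finding_a": 0.1, "finding_b": 0.4},
}

result = posterior(priors, likelihoods, {"finding_a": True, "finding_b": True})
print({d: round(p, 3) for d, p in result.items()})
print("most probable:", max(result, key=result.get))
```

Although D1 is the less prevalent disease here, the two findings are so much more likely under D1 that its posterior probability (0.112 / 0.144, about 0.78) exceeds that of D2.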

The great advantage of Bayesian classification is that a large clinical database of past cases is not required, so a decision can be reached faster than with database search techniques; furthermore, classification errors due to the use of inappropriate clinical inferences are quantifiable. However, a drawback of this approach to clinical decision making is that the disease states are considered as complete and mutually exclusive, whereas in real life neither assumption may be true.

Nevertheless, Bayesian decision functions as a basis for differential diagnosis have been used successfully, for example, in the diagnosis of acute abdominal pain [27]. De Dombal first described this system in 1972, but it took another 20 years or so for it to be accepted via a multi-center multinational trial. The approach has been exploited in ILIAD; this is a commercially available [28] computerized diagnostic decision support system with some 850-990 frames in its knowledge base. As it is a Bayesian system, each frame has the prevalence of a disease for its prior probability. There is the possibility, however, that the prevalence rates may not have general applicability. This highlights a very real problem, namely the validity of relating causal pathways in clinical thinking and connecting such pathways to a body of validated (true) evidence. Ideally, such evidence will come from randomized controlled clinical or epidemiological trials. However, such studies may be subject to bias. To overcome this, Eddy and co-workers devised the Confidence Profile Method [29]. This is a set of quantitative techniques for interpreting and displaying the results of individual studies (trials); exploring the effects of any biases that might affect the internal validity of the study; adjusting for external validity; and, finally, combining evidence from several sources. This meta-analytical approach can formally incorporate experimental evidence and, in a Bayesian fashion, also the results of previous analytical studies or subjective judgments about specific factors that might arise when interpreting evidence. Influence diagram representations play an important role in linking results in published studies and estimates of probabilities and statements about causality.

Currently, much interest is being generated as to how what is perceived as the Bayesian action-oriented approach can be used in determining health policy, where the problem is perceived to be a decision problem rather than a statistical problem (see, for instance, Lilford and Braunholz [30]).

Dempster-Shafer Theory

One way to overcome the problem of mutually exclusive disease states is to use an extension to Bayesian classification put forward by Dempster [31] and Shafer [32]. Here, instead of focusing on a single disorder, the method can deal with combinations of several diseases. The key concept used is that the set of all possible diseases is partitioned into n-tuples of possible disease state combinations. A simple example will illustrate this concept. Suppose there is a clinical scenario in which four disease states describe the whole hypothesis space. Each new item of evidence will impact on all the possible subsets of the hypothesis space and is represented by a function, the basic probability assignment. This measure is a belief function that must obey the laws of probability and sum to unity across the subsets impacted upon. In the example, all possible subsets comprise: one which has all four disease states in it; four which have three of the four diseases as members; six which have two diseases as members; and finally, four subsets which have a single disease as a member. Thus, when new evidence becomes available in the form of a clinical finding, only certain hypotheses, represented by individual subsets, may be favored.
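The four-disease example can be sketched directly. The fragment below enumerates the non-empty subsets of the hypothesis space and combines two basic probability assignments with Dempster's rule of combination, the standard way of pooling evidence in this theory; the disease labels A to D and the mass values are invented for illustration.

```python
from itertools import combinations


def nonempty_subsets(frame):
    """All non-empty subsets of the hypothesis space."""
    items = sorted(frame)
    return [frozenset(c) for r in range(1, len(items) + 1) for c in combinations(items, r)]


def combine(m1, m2):
    """Dempster's rule: multiply masses, keep intersections, renormalize for conflict."""
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("the two bodies of evidence are in total conflict")
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


frame = frozenset("ABCD")            # four hypothetical disease states
subsets = nonempty_subsets(frame)
print(len(subsets), "non-empty subsets")

# Hypothetical evidence: each assignment sums to unity over the subsets it touches.
m1 = {frozenset("AB"): 0.6, frame: 0.4}    # finding 1 favors "A or B"
m2 = {frozenset("BC"): 0.7, frame: 0.3}    # finding 2 favors "B or C"

for subset, mass in sorted(combine(m1, m2).items(), key=lambda kv: -kv[1]):
    print("".join(sorted(subset)), round(mass, 2))
```

The enumeration gives the 15 subsets counted in the text (4 + 6 + 4 + 1), and combining the two findings concentrates most of the mass on the single disease B, which both findings support.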

Syntactic Pattern Analysis

As demonstrated above, a large class of clinical problem solving using statistical methods involves classification or diagnosis of disease states, selection of optimal therapy regimes, and prediction of patient outcome. However, in some cases the purpose of modelling is to reconstruct the input signal from the data available. This cannot be done by methods discussed thus far. The syntactic approach to pattern recognition uses a hierarchical decomposition of information and draws upon an analogy to the syntax of language. Each input pattern is described in terms of more simple subpatterns which themselves are decomposed into simpler subunits, until the most elementary subpatterns, termed the pattern primitives, are reached. The pattern primitives should be selected so that they are easy to recognize with respect to the input signal. Rules which govern the transformation of pattern primitives back (ultimately) to the input signal are termed the grammar.

In this way a string grammar, G, which is easily representable in computer-based applications, can be defined:

G = {Vt, Vn, S, P}

where Vt are the terminal variables (pattern primitives); Vn are the nonterminal variables; S is the start symbol; and P is the set of production rules which specify the transformation between each level of the hierarchy. It is an important assumption that, in set theoretic terms, the union of Vt and Vn is the total vocabulary of G, and the intersection of Vt and Vn is the null (empty) set.
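A string grammar of this form is easy to represent in code. The sketch below defines a toy grammar whose primitives stand for up-slope (`u`), down-slope (`d`), and flat (`f`) waveform segments, checks the disjointness assumption, and tests whether a string of primitives can be derived from the start symbol. The grammar itself and the simple recursive recognizer are illustrative assumptions, not the method of the cited ECG or EEG systems.

```python
from functools import lru_cache

VT = frozenset({"u", "d", "f"})          # terminal variables (pattern primitives)
VN = frozenset({"S", "PEAK"})            # nonterminal variables
START = "S"                              # start symbol
PRODUCTIONS = {                          # production rules
    "S": [("f", "S"), ("PEAK", "S"), ("f",), ("PEAK",)],
    "PEAK": [("u", "d")],
}

# Vt and Vn must not overlap; together they form the vocabulary of G.
assert VT.isdisjoint(VN)
VOCABULARY = VT | VN


def accepts(primitives):
    """Return True if the sequence of primitives can be derived from START.

    Works for grammars like this one, with no empty productions and no
    left recursion, so every recursive call consumes at least one primitive.
    """
    tokens = tuple(primitives)

    @lru_cache(maxsize=None)
    def ends(symbol, start):
        """Positions at which `symbol` can end when it begins at `start`."""
        if symbol in VT:
            if start < len(tokens) and tokens[start] == symbol:
                return frozenset({start + 1})
            return frozenset()
        results = set()
        for rhs in PRODUCTIONS[symbol]:
            positions = {start}
            for part in rhs:
                positions = {e for p in positions for e in ends(part, p)}
            results |= positions
        return frozenset(results)

    return len(tokens) in ends(START, 0)


print(accepts("fudf"))   # flat, peak (up then down), flat
print(accepts("fuf"))    # an up-slope with no matching down-slope
```

Here a signal is accepted only if every up-slope is immediately followed by a down-slope, which is the kind of structural constraint a production rule can express and a purely statistical classifier cannot.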

A syntactic pattern recognition system therefore comprises three functional subunits (Fig. 179.2): a preprocessor, which manipulates the input signal, P, into a form that can be presented to the pattern descriptor; the pattern descriptor, which assigns a vocabulary to the signal; and the syntax analyzer, which classifies the signal accordingly. This type of system has been used successfully to represent the electrocardiogram [33, 34] and the electroencephalogram [35] and for representation of the carotid pulse wave [36].

Causal Modeling

A causal probabilistic network (CPN) is an acyclic multiply-connected graph which at a qualitative level comprises nodes and arcs [37]. Nodes are the domain objects and may represent, for example, clinical findings, pathophysiologic states, diseases, or therapies. Arcs are the causal relationships between successive nodes and are directed links. In this way the node and arc structure represents a model of the domain. Quantification is expressed in the model by a conditional probability table being associated with each arc, allowing the state of each node to be represented as a binary value, or more frequently as a continuous probability distribution. In root nodes the conditional probability table reduces to a probability distribution of all its possible states.

A key concept of CPNs is that computation is reduced to a series of local calculations, using only one node and those that are linked to it in the network. Any node can be instantiated with an observed value; this evidence is then propagated through the CPN via a series of local computations. Thus, CPNs can be used in two ways: to instantiate the leaf nodes of the network with known patterns for given disorders to investigate expected causal pathways; or to instantiate the root nodes or nodes in the graphical hierarchy with, for example, test results to obtain a differential diagnosis. The former method has been used to investigate respiratory pathology [38], and the latter method has been used to obtain pathologic information for electromyography [39].
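The effect of instantiating a node and propagating the evidence can be sketched on a three-node network. For brevity the fragment uses brute-force enumeration of the joint distribution in place of the local propagation scheme described above; the two give the same answers on a network this small. The network (a disease node D with arcs to a symptom S and a test result T) and all probabilities are invented.

```python
from itertools import product

# Hypothetical network: D -> S and D -> T, all nodes binary.
P_D = {True: 0.1, False: 0.9}                 # root node: plain distribution
P_S_GIVEN_D = {True: 0.8, False: 0.2}         # P(S = True | D)
P_T_GIVEN_D = {True: 0.9, False: 0.05}        # P(T = True | D)
NODES = ("D", "S", "T")


def joint(d, s, t):
    """Joint probability from the root distribution and the conditional tables."""
    p_s = P_S_GIVEN_D[d] if s else 1.0 - P_S_GIVEN_D[d]
    p_t = P_T_GIVEN_D[d] if t else 1.0 - P_T_GIVEN_D[d]
    return P_D[d] * p_s * p_t


def query(target, evidence):
    """P(target = True | evidence), summing the joint over consistent states."""
    totals = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(NODES)):
        state = dict(zip(NODES, values))
        if all(state[name] == value for name, value in evidence.items()):
            totals[state[target]] += joint(*values)
    return totals[True] / (totals[True] + totals[False])


# Instantiate the cause and read off the expected finding.
print("P(S | D present):", round(query("S", {"D": True}), 3))
# Instantiate the findings and read off the probability of the cause.
print("P(D | S and T positive):", round(query("D", {"S": True, "T": True}), 3))
```

The same network answers both kinds of question, depending only on which nodes are instantiated: from cause to expected findings, or from observed findings back to the probability of the cause.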

Artificial Neural Networks

Artificial neural networks (ANNs) mimic their biologic counterparts, although at the present time on a much smaller scale. The fundamental unit in the biological system is the neuron. This is a specialized cell which, when activated, transmits a signal to its connected neighbors. Both activation and transmission involve chemical transmitters which cross the synaptic gap between neurons. Activation of the neuron takes place only when a certain threshold is reached. This biologic system is modelled in the representation of an artificial neural network. It is possible to identify three basic elements of the neuron model: a set of weighted connecting links that form the input to the neuron (analogous to neurotransmission across the synaptic gap); an adder for summing the input signals; and an activation function which limits the amplitude of the output of the neuron to the range (typically) -1 to +1. This activation function also has a threshold term that can be applied externally and forms one of the parameters of the neuron model. Many books are available which provide a comprehensive introduction to this class of model [e.g., 40].
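The three elements of the neuron model map onto a few lines of code. In this sketch the hyperbolic tangent is assumed as the activation function because it limits the output to the range -1 to +1; the input values, weights, and threshold are arbitrary illustrative numbers.

```python
import math


def neuron_output(inputs, weights, threshold=0.0):
    """Single artificial neuron: weighted links, adder, and bounded activation."""
    if len(inputs) != len(weights):
        raise ValueError("one weight is needed per input")
    # Adder: sum the weighted inputs, then subtract the externally applied threshold.
    net = sum(w * x for w, x in zip(weights, inputs)) - threshold
    # Activation function: tanh squashes the result into (-1, +1).
    return math.tanh(net)


print(round(neuron_output([1.0, 0.5], [0.4, -0.2], threshold=0.1), 4))
print(round(neuron_output([10.0, 10.0], [1.0, 1.0]), 4))   # saturates near +1
```

Raising the threshold shifts the point at which the neuron's output turns positive, which is why it is treated as one of the adjustable parameters of the model alongside the weights.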

ANNs can be applied to two categories of problems: prediction and classification. It is the latter which has caught the imagination of biomedical engineers for its similarity to diagnostic problem solving. For instance, the conventional management of patients with septicemia requires a diagnostic strategy that takes up to 18 to 24 h before initial identification of the causal microorganism. This can be compared to a method in which an ANN is applied to a large clinical database of past cases; the quest becomes one of seeking an optimal match between present clinical findings and patterns present in the recorded data. In this application pattern matching is a non-trivial problem as each of the 5000 past cases has 51 data fields. It has been shown that for this problem the ANN method outperforms other statistical methods such as k-nearest neighbor [41].


Summary

This chapter has reviewed what are normally considered to be the major categories of approach available to support clinical decision making which do not rely on what is classically termed artificial intelligence (AI). They have been considered under the headings of analytical, decision theoretic, and statistical models, together with their corresponding subdivisions. It should be noted, however, that the division into non-AI approaches and AI approaches that is adopted in this volume (see Chapter 184) is not totally clear-cut. In essence, the range of approaches can in many ways be regarded as a continuum. There is no unanimity as to where the division should be placed, and the separation adopted here is but one of a number that are feasible. It is therefore desirable that the reader should consider these two chapters together and choose an approach that is relevant to their particular clinical context.


References

1. Meehl PE. 1954. Clinical versus Statistical Prediction, Minnesota, University of Minnesota Press.

2. Savage LJ. 1954. The Foundations of Statistics, New York, Wiley.

3. Ledley RS, Lusted LB. 1959. Reasoning foundations of medical diagnosis. Science 130:9.

4. Nash FA. 1954. Differential diagnosis: An apparatus to assist the logical faculties. Lancet 4:874.

5. Leaning MS, Pullen HE, Carson ER, et al. 1983. Modelling a complex biological system: The human cardiovascular system: 1. Methodology and model description. Trans. Inst. Meas. Contr. 5:71.

6. Leaning MS, Pullen HE, Carson ER, et al. 1983. Modelling a complex biological system: The human cardiovascular system: 2. Model validation, reduction and development. Trans. Inst. Meas. Contr. 5:87.

7. Kuipers BJ. 1986. Qualitative simulation. Artif. Intell. 29:289.

8. Furukawa T, Tanaka H, Hara S. 1987. FLUIDEX: A microcomputer-based expert system for fluid therapy consultations. In: MK Chytil, R Engelbrecht (Eds.), Medical Expert Systems, pp. 59-74, Wilmslow, Sigma Press.

9. Bratko I, Mozetic J, Lavrac N. 1988. In: D Michie, I Bratko (Eds.), Expert Systems: Automatic Knowledge Acquisition, pp. 61-83, Reading, Mass, Addison-Wesley.

10. Bleich HL. 1972. Computer-based consultations: Electrolyte and acid-base disorders. Amer. J. Med. 53:285.

11. McKenzie DP, McGorry PD, Wallace CS, et al. 1993. Constructing a minimal diagnostic decision tree. Meth. Inform. Med. 32:161.

12. Pauker SG, Kassirer JP. 1987. Decision analysis. N. Engl. J. Med. 316:250.

13. Weinstein MC, Fineberg HV. 1980. Clinical Decision Analysis. London, Saunders.

14. Watson SR, Buede DM. 1994. Decision Synthesis. Cambridge, Cambridge University Press.

15. Sox HC, Blatt MA, Higgins MC, Marton KI. 1988. Medical Decision Making. Boston, Butterworth Heinemann.

16. Llewelyn H, Hopkins A. 1993. Analysing How We Reach Clinical Decisions. London, Royal College of Physicians.

17. Gold MR, Siegel JE, Russell LB, Weinstein MC. (Eds.) 1996. Cost-effectiveness in Health and Medicine. New York, Oxford University Press.

18. Sloan FA. (Ed.) 1996. Valuing Health Care. Cambridge, Cambridge University Press.

19. Owen DL. 1984. The use of influence diagrams in structuring complex decision problems. In: RA Howard and JE Matheson (Eds.), Readings on the Principles and Applications of Decision Analysis, vol. 2, pp. 763-72, Menlo Park, CA, Strategic Decisions Group.

20. Owens DK, Shachter RD, Nease RF. 1997. Representation and analysis of medical decision problems with influence diagrams. Med. Dec. Mak. 17:241.

21. Nease RF, Owens DK. 1997. Use of influence diagrams to structure medical decisions. Med. Dec. Mak. 17:263.

22. Quinlan JR. 1979. Rules by induction from large collections of examples. In: D Michie (Ed.), Expert Systems in the Microelectronic Age, Edinburgh, Edinburgh University Press.

23. Gammerman A, Thatcher AR. 1990. Bayesian inference in an expert system without assuming independence. In: MC Golumbic (Ed.), Advances in Artificial Intelligence, pp. 182-218, New York, Springer-Verlag.

24. Spiegelhalter DJ, Knill-Jones RP. 1984. Statistical and knowledge-based approaches to clinical decision-support systems with an application in gastroenterology. J. Roy. Stat. Soc. A 147:35.

25. Titterington DM, Murray GD, Murray LS, et al. 1981. Comparison of discriminant techniques applied to a complex set of head injured patients. J. Roy. Stat. Soc. A 144:145.

26. Morganroth J. 1984. Computer recognition of cardiac arrhythmias and statistical approaches to arrhythmia analysis. Ann. NY Acad. Sci. 432:117.

27. De Dombal FT, Leaper DJ, Staniland JR, et al. 1972. Computer-aided diagnosis of acute abdominal pain. Br. Med. J. 2:9.

28. Applied Medical Informatics, Salt Lake City, UT.

29. Eddy DM, Hasselblad V, Shachter R. 1992. Meta-Analysis by the Confidence Profile. London, Academic Press.

30. Lilford RJ, Braunholz D. 1996. The statistical basis of public policy: a paradigm shift is overdue. Br. Med. J. 313:603.

31. Dempster A. 1967. Upper and lower probabilities induced by multi-valued mapping. Ann. Math. Stat. 38:325.

32. Shafer G. 1976. A Mathematical Theory of Evidence, Princeton, NJ, Princeton University Press.

33. Belforte G, De Mori R, Ferraris F. 1979. A contribution to the automatic processing of electrocardiograms using syntactic methods. IEEE Trans. Biomed. Eng. BME-26 (3):125.

34. Birman KP. 1982. Rule-based learning for more accurate ECG analysis. IEEE Trans. Pat. Anal. Mach. Intell. PAMI-4 (4):369.

35. Ferber G. 1985. Syntactic pattern recognition of intermittent EEG activity. Meth. Inf. Med. 24 (2):79.

36. Stockman GC, Kanal LN. 1983. Problem reduction in representation for the linguistic analysis of waveforms. IEEE Trans. Pat. Anal. Mach. Intell. PAMI-5 (3):287.

37. Andersen SK, Jensen FV, Olesen KG. 1987. The HUGIN Core-Preliminary Considerations on Inductive Reasoning: Managing Empirical Information in AI Systems. Riso, Denmark.

38. Summers R, Andreassen S, Carson ER, et al. 1993. A causal probabilistic model of the respiratory system. In: Proc. IEEE 15th Ann. Conf. Eng. Med. Biol. Soc., New York, IEEE, pp. 534-535.

39. Jensen FV, Andersen SK, Kjaerulff U, et al. 1987. MUNIN: On the case for probabilities in medical expert systems - a practical exercise. In: J Fox, M Fieschi, R Engelbrecht (Eds.), Proc. First Conf. Europ. Soc. AI Med., pp. 149-160, Heidelberg, Springer-Verlag.

40. Haykin S. 1994. Neural Networks: A Comprehensive Foundation, New York, Macmillan.

41. Worthy PJ, Dybowski R, Gransden WR, et al. 1993. Comparison of learning vector quantisation and nearest neighbour for prediction of microorganisms associated with septicaemia. In: Proc. IEEE 15th Annu. Conf. Eng. Med. Biol. Soc., New York, IEEE, pp. 273-274.

Goethe, J. W., Bronzino, J. D. “Design Issues in Developing Clinical Decision Support and Monitoring Systems.” The Biomedical Engineering Handbook: Second Edition. Ed. Joseph D. Bronzino. Boca Raton: CRC Press LLC, 2000