Medical FactNet Barry Smith Christiane Fellbaum University at Buffalo and IFOMIS, Leipzig

Medical FactNet
Barry Smith
University at Buffalo and IFOMIS, Leipzig
Christiane Fellbaum
Princeton University and Berlin Academy
Online-Inquiry to MEDLINEplus
Query text | Response (with links to documents sorted by the following keywords)
tremor | Tremor, Multiple Sclerosis, Parkinson’s Disease, Degenerative Nerve Diseases, Movement Disorders
intentional tremor | Tremor, Multiple Sclerosis, Parkinson’s Disease, Spinal Muscular Atrophy, Degenerative Nerve Diseases
tremble | Anxiety, Parkinson’s Disease, Panic Disorder, Caffeine, Tremor
trembling | Anxiety, Parkinson’s Disease, Panic Disorder, Phobias, Tremor
right hand trembles | Phobias, Anxiety, Infant and Toddler Development, Parkinson’s Disease, Diabetes
right hand trembles when grasping | Infant and Toddler Development, Sports Fitness, Sports Injuries, Diabetes, Rehabilitation
A consumer health medical information
system must be able to map between
expert and non-expert medical vocabulary
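A minimal sketch of such a mapping, assuming a hand-built lookup table. The table `LAY_TO_EXPERT` and the function `normalize_query` are hypothetical names, and the entries are illustrative only (the chickenpox/varicella pair is taken from the doctor-patient example later in these slides). It shows how a lay query could be rewritten into expert vocabulary before it is sent to a term-search engine.

```python
# Hypothetical lay-to-expert lookup table; the entries are illustrative only.
LAY_TO_EXPERT = {
    "tremble": "tremor",
    "trembles": "tremor",
    "trembling": "tremor",
    "chickenpox": "varicella",
}


def normalize_query(query: str) -> list[str]:
    """Replace each lay word that has a known expert equivalent."""
    return [LAY_TO_EXPERT.get(word, word) for word in query.lower().split()]


print(normalize_query("right hand trembles"))
```

A word-by-word table like this cannot handle multi-word expressions or polysemy, which is part of the motivation for the validated resources proposed here.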
GOAL: A unified medical language system
for non-expert medical vocabulary
UMLS for dummies
A New Methodology for the Construction
and Validation of Information Resources
for Consumer Health
MWN: SPECIFIC AIMS
to extend and validate WordNet 2.0’s medical
coverage in light of recent advances in medical
terminology research
focusing initially on the English-language single-word
expressions used and understood by non-experts
provision of a mapping to UMLS, MeSH, and other
expert terminologies
use as interlingua for MWNs in other languages
WordNet (Miller, Fellbaum)
Large lexical database; ubiquitous tool of NLP
coverage comparable to collegiate dictionary,
over 130,000 word forms
40 wordnets in different languages
WordNet: rich medical coverage, but poorly
validated and poor formal architecture
How to create a validated Medical WordNet
(MWN)?
Building blocks of WordNet = ‘synsets’
= ‘concepts’ in medical terminology
terms in same synset = they are interchangeable
in some sentential contexts without altering
truth-value:
{car, automobile}, {shut, close}
synsets linked via small number of binary relations:
is-a
part-of
verb entailments: (walk-limp, forget-know).
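The synset structure described above can be sketched as a small data type. This is an illustrative model, not WordNet's actual storage format: the class `Synset`, the helper `same_synset`, and the `vehicle` synset with its is-a link are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A set of interchangeable terms plus typed links to other synsets."""
    name: str
    terms: frozenset[str]
    relations: dict[str, set[str]] = field(default_factory=dict)

    def link(self, relation: str, target: "Synset") -> None:
        """Record a binary relation (e.g. is-a, part-of) to another synset."""
        self.relations.setdefault(relation, set()).add(target.name)


def same_synset(a: str, b: str, synsets: list[Synset]) -> bool:
    """True when both terms occur together in at least one synset."""
    return any(a in s.terms and b in s.terms for s in synsets)


car = Synset("car", frozenset({"car", "automobile"}))
vehicle = Synset("vehicle", frozenset({"vehicle"}))  # hypothetical parent
car.link("is-a", vehicle)

print(same_synset("car", "automobile", [car, vehicle]))
print(car.relations)
```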
Strengths of WordNet 2.0
Open source
Very broad coverage
Is-a / part-of architecture
Tool for automatic sense disambiguation
13 senses for ‘feel’ as a verb
experience – She felt resentful
find – I feel that he doesn't like me
feel – She felt small and insignificant
feel – We felt the effects of inflation
feel – The sheets feel soft
grope – He felt for his wallet
finger – Feel this soft cloth!
explore – He felt his way around the dark room
feel – It feels nice to be home again
feel – He felt the girl in the movie theater
Medical senses of ‘feel’
palpate – examine a body part by palpation:
The nurse palpated the patient's stomach; The
runner felt her pulse.
sense – perceive by a physical sensation, e.g.
coming from the skin or muscles:
He felt his flesh crawl; She felt the heat when
she got out of the car; He feels pain when he
puts pressure on his knee.
feel – seem with respect to a given sensation:
My cold is gone – I feel fine today; She felt tired
after the long hike.
MWN
many word units are monosemic (clinician,
stethoscope)
most common words are polysemic
lexicon of the order of 4000 word units
with some 3,000 distinct word senses.
tested by incorporation in NLP applications used
for purposes of information retrieval, machine
translation, question-answer systems, text
summarization
How to validate Medical WordNet?
How to fix the scope of ‘non-expert’?
Answer: Medical FactNet (MFN)
a large corpus of natural-language
sentences providing medically validated
contexts for MWN terms.
pilot corpus: 40,000 sentences
full MFN (for common diseases): ~250,000
sentences
accredited as intelligible by non-experts
and as true by experts
Medical BeliefNet (MBN)
= totality of sentences about medical
phenomena to which non-experts assent
comes for free, given our methodology for
creating MFN
Sources for MFN
1. WordNet glosses and arcs
2. Online health information services
targeted to consumers
NetDoctor, MEDLINEplus
(factsheets on common diseases)
Constructing MBN and MFN
sources (WordNet, MEDLINEplus …)
filtering for intelligibility by non-experts
pool of natural language sentences
filtering for non-expert assent
Medical BeliefNet
filtering for validation by experts
?
Medical FactNet
MFN: SPECIFIC AIMS
To create a pilot open-source corpus of sentences
about medical phenomena in the English
language
restricted to natural language
grammatically complete
logically and syntactically simple sentences
rated as understandable by non-expert human
subjects in controlled questionnaire-based
experiments
MFN: SPECIFIC AIMS
= sentences must be self-contained
make no reference to any prior context
not contain any proper names, indexical
expressions or other linguistic devices that
need to be interpreted with respect to
other sentences.
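The self-containedness requirement is judged by people, but a crude automatic pre-filter could flag obvious violations before sentences reach human raters. The following is only a heuristic sketch: the word list `CONTEXT_WORDS` is an assumed, incomplete inventory of indexical and anaphoric words, and the capitalization test for proper names will misfire on many sentences.

```python
import re

# Assumed, incomplete list of words that usually depend on prior context.
CONTEXT_WORDS = {
    "this", "that", "these", "those", "it", "he", "she", "they",
    "here", "there", "now", "today", "i", "you", "we",
    "my", "your", "our", "his", "her", "their",
}


def looks_self_contained(sentence: str) -> bool:
    """Heuristically reject sentences with indexicals or proper names."""
    words = re.findall(r"[A-Za-z']+", sentence)
    if any(w.lower() in CONTEXT_WORDS for w in words):
        return False
    # Crude proper-name check: a capitalized word after the first position.
    return not any(w[0].isupper() for w in words[1:])


print(looks_self_contained("A cold is not an allergy."))
print(looks_self_contained("Albany is the capital of New York"))
```

The filter is deliberately conservative: it also rejects harmless uses of words such as "that", so it could only serve to prioritize candidates, not to replace the questionnaire-based rating.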
Constructing MFN
Sentences in MFN must receive high marks
for correctness on being assessed by
medical experts.
MFN designed to constitute a representative
fraction of the true beliefs about medical
phenomena which are intelligible to non-expert English speakers.
Constructing MBN
Sentences in MBN must receive high marks
for assent on being assessed by non-experts.
MBN designed to constitute a representative
fraction of the beliefs about medical
phenomena (both true and false beliefs)
distributed through the population of
English speakers.
Compiling MFN and MBN in tandem
will allow systematic assessment of the disparity
between lay beliefs and vocabulary as concerns
medical phenomena and the exactly
corresponding expert medical knowledge.
will allow us to establish automatically for any
given sub-population which areas its beliefs
about medical phenomena differ most
significantly from validated medical knowledge
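One way such a comparison could be automated, sketched under strong assumptions: every sentence carries a topic label, and we know both which sentences a sub-population assents to and which ones experts validated. The function `disparity_by_topic`, the topic labels, and the validation status assigned to the sample sentences are all hypothetical.

```python
from collections import Counter


def disparity_by_topic(assented, validated, topic_of):
    """Count, per topic, the assented sentences that experts did not validate."""
    return Counter(topic_of[s] for s in assented if s not in validated)


# Illustrative data only; the topic labels and validation status are assumed.
topic_of = {
    "People can only snore when they are asleep.": "sleep",
    "A cold is not an allergy.": "allergy",
}
assented = set(topic_of)                   # beliefs held by the sub-population
validated = {"A cold is not an allergy."}  # sentences accepted by experts

gaps = disparity_by_topic(assented, validated, topic_of)
print(gaps.most_common())
```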
USES OF MFN
for quality assurance of MWN
to support the population of MWN by yielding
new families of words and word senses
medical education
consumer health information
(in conjunction with MBN) allow new sorts of
experiments in the linguistics, psychology
and anthropology of consumer health
Evaluation of MFN
measure the benefits it brings when
incorporated into an existing on-line
consumer health portal based on term-search technology.
test whether exploiting the resources of
MFN can lead to improved results in the
retrieval of expert information
Differences between expert and non-expert medical language
mismatch between expert and non-expert
language
taxonomies reflecting popular lexicalizations
have small coverage relative to technical
vocabularies
and shallow hierarchies:
no popular terms linking infectious disease and
mumps
Differences between expert and non-expert medical language
popular medical terms (flu) often fuzzier than
technical terms
extension of non-expert term used also by
experts sometimes smaller, sometimes
larger
hypothesis: with few exceptions the focal
meanings coincide in their extensions
Mismatches in Doctor-Patient
Communication
Practical skills of physician in acquiring and
conveying relevant and reliable
information by using non-expert language
tailored to individual patient
The physician, too, is a human being, thus
ex officio a member of the wider
community of non-experts
continues to use non-expert language for
everyday purposes
But there are problems
Question: My seven-year-old son developed a rash today … a
friend of mine had her 10-day-old baby at my home last
evening before we were aware of the illness. … I have read
that chickenpox is contagious up to two days prior to the
actual rash. Is there cause for concern at this point?
Answer: Chickenpox is the common name for varicella
infection. ...
You are correct in that a person with chickenpox can be
contagious for 48 hours before the first vesicle is seen. ...
Of concern, though, is the fact that newborns are at higher
risk of complications of varicella, including pneumonia. ...
There is a very effective means to prevent infection after
exposure. A form of antibody to varicella called varicella-zoster
immune globulin (VZIG) can be given up to 48 hours
after exposure and still prevent disease. ...
(from Slaughter)
Lexical mismatches
rooted in legal concerns?
both primary care physician and online information
system must respond primarily with generic, or
case- or context-independent, information
most requests relate to specific and episodic
phenomena (occurrences of pain, fever,
reactions to drugs, etc.).
Hence focus of MFN on generic sentences =
context-independent statements about causality,
about types of persons or diseases or about
typical or possible courses of a disease.
MFN
designed to map the generic medical
information which non-experts are able to
understand
Corpus- and fact-based approaches
to information retrieval
meanings of highly polysemous terms cannot be
discriminated without consideration of their
contexts.
People do this without apparent difficulties
New NLP methodologies to harness computers to
manipulate large text corpora
Train automatic systems on large numbers of
semantically annotated sentences, exploit
standard pattern-recognition and statistical
techniques for purposes of disambiguation.
Use of WordNet in medical
informatics
e.g. as tool for simplifying information
extraction from the corpus of MEDLINE
abstracts:
by replacing verbs with corresponding
synsets and so reducing the number of
relations that need to be taken account of
in the analysis of texts
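The reduction described above can be sketched as follows. The table `VERB_SYNSET` and its synset identifiers are invented for illustration (only the shut/close pairing comes from these slides; the alleviate/ease grouping is hypothetical), and the triples are made-up examples rather than MEDLINE data.

```python
# Hypothetical verb-to-synset table; identifiers are made up for illustration.
VERB_SYNSET = {
    "shut": "close.v",
    "close": "close.v",
    "alleviate": "ease.v",  # assumed grouping, not checked against WordNet
    "ease": "ease.v",
}


def normalize_relations(triples):
    """Map each verb to its synset so that synonymous relations collapse."""
    return {(subj, VERB_SYNSET.get(verb, verb), obj) for subj, verb, obj in triples}


extracted = [
    ("drug", "alleviate", "pain"),
    ("drug", "ease", "pain"),
    ("drug", "cause", "nausea"),
]
print(len(extracted), "->", len(normalize_relations(extracted)))
```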
Example: FrameNet
500 Frames, each with a plurality of Frame
Elements
Medical Frames:
Addiction, Birth, Biological Urge, Body Mark,
Cure, Death, Health Response, Medical
Conditions, Medical Instruments, Medical
Professional, Medical Specialties and
Observable Body Parts.
Frame: Cure
Frame Elements:
alleviate. v, alleviation. n, curable. a,
curative. a, curative. n, cure. n, cure. v,
ease. v, heal. v, healer. n, incurable. a,
palliate. v, palliation. n, palliative. a,
palliative. n, rehabilitate. v, rehabilitation.
n, rehabilitative. a, remedy. n, resuscitate.
v, therapeutic. a, therapist. n, therapy. n,
treat. v, treatment. n.
Example: Penn Proposition Bank
designed as a corpus of coherent texts. The
intention is to train an automatic system to
‘learn’ the contexts for words and their
context-specific meanings.
corpus characterized by a specific logical
(function-argument-based) architecture.
Both FrameNet and Proposition
Bank
have poor medical coverage
Both focus on word usage in general, rather
than on domain-specific contexts.
Neither concerned with the questions of
factuality or validation of statements
Example: CYC knowledge base
collection of hundreds of thousands of
statements mostly about the external
world:
The earth is round
Mountains are one kind of landform
Albany is the capital of New York
parcelled into micro-theories
In contrast to CYC,
(i) MFN focuses on one single (albeit very
large) domain
(ii) MFN stores English sentences (CYC is
language non-specific);
(iii) MFN discriminates folk beliefs and
expert knowledge (designed to be
consistent with the body of established
science);
(iv) MFN will be publicly available.
Existing Princeton WordNet 2.0
labels 504 word-forms ‘medicine’:
infection#1 {(the pathological state resulting
from the invasion of the body by pathogenic
microorganisms)}
infection#3 {(the invasion of the body by
pathogenic microorganisms and their
multiplication which can lead to tissue
damage and disease)}
infection#4 {infection, contagion, transmission
– (an incident in which an infectious disease
is transmitted)}
Maturation
maturation#2 {growth, growing, maturation,
development, ontogeny, ontogenesis – ((biology)
the process of an individual organism growing
organically; a purely biological unfolding of
events involved in an organism changing
gradually from a simple to a more complex level;
he proposed an indicator of osseous
development in children)}
maturation#3 {festering, suppuration, maturation
– (the formation of morbific matter in an abscess
or a vesicle and the discharge of pus)}
But it
mixes up expert and
non-expert vocabulary,
both current and medieval:
suppuration#2 {pus, purulence,
suppuration, ichor, sanies, festering – (a
fluid product of inflammation)}
And it contains medically relevant
errors:
snore-sleep linked via verb entailment: “if someone
snores, then he necessarily also sleeps.”
In medicine: quite possible to snore while awake,
since snoring implies the respiratory induced
vibration of glottal tissues as associated not only
(and most usually) with sleep but also with
relaxation or obesity.
Methodology for constructing MFN will provide us
with a systematic means to detect such errors.
snore → sleep
Constructing MBN will give us the resources
to do justice to the reason why such cases
were included in the first place:
People can only snore when they are asleep
and similar sentences belong precisely to
the folk beliefs about medicine which MBN
will document
Extracting sentences from online
consumer health information sources
In one experiment sentences were derived
by researchers in medical informatics from
factsheets on Airborne allergens in
NIAID’s Health Information Publications
and on Hay fever and perennial allergic
rhinitis in the UK NetDoctor’s Diseases
Encyclopedia.
Source (NIAID), from NIAID HealthInfo:
“There is no good way to tell the difference between allergy
symptoms of runny nose, coughing, and sneezing and cold
symptoms. Allergy symptoms, however, may last longer than
cold symptoms.”
Output:
Allergies have symptoms.
Colds have symptoms.
A runny nose is a symptom of an allergy.
Coughing is a symptom of an allergy.
Sneezing is a symptom of an allergy.
Cold symptoms are similar to allergy symptoms.
A cold is not an allergy.
Allergy symptoms may last longer than cold symptoms.
Output sentences
use simple syntax and draw on natural-language
terms used in original sources
Sentences containing anaphora,
instructions, warnings, … are replaced by
complete statements constructed via
simple syntactic modifications – or
ignored.
Output Sentences
1644 sentences produced (= 20 person hours of
effort)
500 sentences were subjected to a preliminary
evaluation by pairs of medical students (on a
score of 1-5 …)
58% were rated with a score of 2 x 5 (a 5 from both raters)
but: measures for inter-rater agreement too low
for these results to be statistically significant.
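The slides do not say which agreement measure was used. As one common option for a pair of raters, Cohen's kappa corrects the raw agreement rate for the agreement expected by chance. The sketch below assumes that measure; the function name `cohen_kappa` and the sample scores are invented for the example.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal rate per category.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Made-up scores on the 1-5 scale for six sentences.
scores_a = [5, 5, 4, 3, 5, 2]
scores_b = [5, 4, 4, 3, 5, 5]
print(round(cohen_kappa(scores_a, scores_b), 3))
```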
Validation methods
sources
A: filtering for intelligibility by non-experts
pool
B: filtering for non-expert assent
Medical BeliefNet
C: filtering for validation by
experts
Medical FactNet
Validation methods
A: filtering for intelligibility by non-experts
This will provide an empirical
delineation of the scope of ‘natural
language’ (non-expert language)
Natural language = language (typical) non-experts (think they) can understand
Does ‘depilation’ belong to natural
language? ‘suppuration’? ‘auto-immune’?
‘tomograph’? ‘hypertension’? ‘radiologist’?
Method
400 x 250 statements will be rated for
understandability by two participants, making for
a total of 200,000 ratings in response to the
question:
on a scale from 1-5, would you describe this
sentence as hard to understand or easy to
understand?
Raters will be encouraged not to reflect on
successive statements
Only those statements which receive a score of at
least 4 from each of 2 subjects will pass on to
the pool
Validation methods
B: filtering for non-expert assent
Method
Collections of 200 statements from the pool will be
rated for assent by each of 250 participants.
on a scale from 1-5, would you describe this
sentence with the words do not agree at all …
agree completely?
Raters will be encouraged to reflect upon their
answers if necessary
Statements receiving a score of at least 4 from
each of two raters will be stored as components
of Medical BeliefNet (MBN).
Validation methods
C: filtering for validation by experts
Method
Raters, selected from medical faculty and
advanced medical students, will be subject to a
pre-evaluation as follows.
A set of 40 sentences in the pool will be
validated as true or false by the relevant
specialists
Only those candidate participants with very high
scores in matching these validations will be
selected to serve as raters in the validations of
sentences for MFN.
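The pre-evaluation step can be sketched as a comparison of each candidate's answers with the specialists' verdicts. The slides only say that "very high scores" are required, so the threshold `min_agreement=0.95` is an assumption, as are the function name `select_raters` and the sample data.

```python
def select_raters(gold, answers_by_candidate, min_agreement=0.95):
    """Keep candidates whose answers match the specialists often enough."""
    selected = []
    for candidate, answers in answers_by_candidate.items():
        matches = sum(answers[s] == truth for s, truth in gold.items())
        if matches / len(gold) >= min_agreement:
            selected.append(candidate)
    return selected


# 40 sentence ids with made-up true/false verdicts from the specialists.
gold = {i: i % 2 == 0 for i in range(40)}
perfect = dict(gold)
# This candidate disagrees with the specialists on the first five sentences.
weaker = {i: (not v if i < 5 else v) for i, v in gold.items()}

print(select_raters(gold, {"A": perfect, "B": weaker}))
```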
Method
Rating for MFN will involve no time constraints
raters will be encouraged to use reference works
On a scale from 1-5, how strongly do you believe
this statement?
Only sentences receiving scores of 5 from each
of two raters will be added to the MFN database.
Thus in relation to those sentences which receive
a score of less than 5, raters will be encouraged
to propose alternative statements, which will be
used as new input to the non-expert phase for
assessment.
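Taken together, the three rating stages amount to simple threshold rules: at least 4 from each of two non-experts for intelligibility, at least 4 from each of two non-experts for assent, and 5 from each of two experts for validation. The sketch below encodes those thresholds. It assumes that expert validation is applied to every sentence in the pool, independently of non-expert assent; the slides leave that point open. The function names are hypothetical.

```python
def passes(scores, minimum):
    """True when exactly two ratings are given and both reach the minimum."""
    return len(scores) == 2 and all(s >= minimum for s in scores)


def classify(intelligibility, assent, expert):
    """Return the collections a sentence belongs to after all three filters."""
    labels = set()
    if not passes(intelligibility, 4):
        return labels  # never enters the pool
    labels.add("pool")
    if passes(assent, 4):
        labels.add("MBN")
    if passes(expert, 5):
        labels.add("MFN")
    return labels


print(sorted(classify((5, 4), (4, 5), (5, 5))))
print(sorted(classify((5, 4), (5, 5), (5, 4))))
```

Under this reading, a sentence found in MBN but not in MFN is a candidate folk belief that experts did not endorse.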
Training of expert raters for MFN
will include e.g. guidance as to the treatment of
statements which relate only to what holds for
the most part or in most cases.
people with a cold sometimes sneeze
could mean either: not all people with a cold
sneeze, contradicting the fact that sneezing is a
mandatory symptom for a cold,
or all people with a cold sneeze, but not all the
time, which would be rated as correct.
Evaluation of MWN and MFN
users of a consumer health information portal will
be randomly assigned to one of four groups: 1.
access to the unsupplemented portal; 2. access
also to MWN, 3. access to MFN, 4. access to
both MWN and MFN
then apply the Saracevic-Kantor method for evaluating
user satisfaction with internet query services
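The random assignment to the four groups could be done as in the sketch below. The slides specify only that assignment is random; shuffling and then dealing users out in turn, which keeps group sizes balanced, is an assumption, as are the names `GROUPS` and `assign_groups`.

```python
import random

GROUPS = ["portal only", "portal + MWN", "portal + MFN", "portal + MWN + MFN"]


def assign_groups(user_ids, seed=0):
    """Shuffle users, then deal them into the four groups in turn."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    return {user: GROUPS[i % len(GROUPS)] for i, user in enumerate(shuffled)}


assignment = assign_groups(range(8))
for group in GROUPS:
    print(group, sum(1 for g in assignment.values() if g == group))
```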
Future work
application of MBN/MFN methodology to
evaluate the reliability of the medical knowledge
of different non-expert communities
by preserving data pertaining to the sources of
entries in MBN it will be possible to keep track of
specific kinds of false beliefs as originating in
specific kinds of informants. This may prove a
valuable source of information in targeting
specific groups for specific types of remedial
medical education.
Future work
experiments in the tradition of E. Rosch to
investigate how the domain of medical
phenomena is conceptualized by non-expert
human subjects
Basic level words: tomato, cabbage vs.
bean vs. vegetable (too general) / cherry tomato
(too specific)
what is the basic level of lexical specification in
the domain of medical phenomena?
what are the basic kinds in the ontology of
medicine of natural-language-using subjects?
Different roles of MFN and MBN
MFN associated with constructing practical
tools designed to assist users in coming to
believe what is true
MBN associated with research regarding
what people believe about medical
phenomena.
Towards a comprehensive assay
of consumer health knowledge
Ultimate goal: to document in an
ontologically coherent fashion the entirety
of the medical knowledge that is capable
of being understood by average adult
consumers of healthcare services in the
United States today.
Just as English WordNet
serves as an interlingual index between
wordnets in different languages,
so MWN and MFN can function as an inter-ontology
index between different expert
factnets prepared for different parts of
technical biomedical knowledge
NLM goal of expert medical factnet
ARistOTLE
Aggregative Realist Ontology of Total
Language
The End