Large Language Models as Data Interfaces for Health Applications by instituteforexperientialai

Silvio Amir

s.amir@northeastern.edu

Which Large Language Model?

• Pretrain + Fine-tune

• Pretrain + Prompt

• Pretrain + Instruction Fine-tune + Prompt/Fine-tune

• Pretrain + Instruction Fine-tune + RL from Human Feedback + Dialogue

https://arxiv.org/abs/2304.13712

Which Large Language Model?

Flan-T5

Large Language Models

One generative model to rule them all

Solve NLP problems with minimal supervision

Help users perform real-world tasks

Large Language Models

Can LLMs replace knowledge workers?

Large Language Models

LLMs can generate false and harmful outputs

• Hallucinations

• Toxicity

• Bias

LLMs and Human-centered AI

How can we use LLMs to empower, rather than replace, human experts?

• Derive insights from large and complex datasets

• Improve data-driven decision-making

LLMs as Conversational Data Interfaces

Surface key pieces of information from unstructured data

Answer complex questions about the data

LLMs as Conversational Data Interfaces

Answers must be grounded in factual data

Abstractive + extractive generation

Can we frame extraction tasks as generation?

- Relation Extraction

- Structured Evidence Inference

LLMs as Conversational Data Interfaces

An LLM answers domain questions over each data source:

• Evidence Based Medicine (RCT Reports): What is the most effective treatment for condition X?

• Public Health (Social Media): What are the main health concerns of population Z?

• Healthcare (Clinical Notes): Is there any evidence that the patient may be suffering from Y?

Relation Extraction

Identify key entities, their types and relations in text

The diagnosis of hypothermia was delayed until it was apparent for several days but resolved with the discontinuation of risperidone and continuation of clozapine

Meanwhile, Shi Liming at the Institute of Zoology of Kunming found that pandas lack variety in their protein heredity, which may serve as one of the major reasons for pandas’ near extinction.

Relation Extraction

Identify key entities, their types and relations in text

The diagnosis of hypothermia:effect was delayed until it was apparent for several days but resolved with the discontinuation of risperidone:drug and continuation of clozapine:drug

Meanwhile, Shi Liming:per at the Institute of Zoology:org of Kunming:loc found that pandas lack variety in their protein heredity, which may serve as one of the major reasons for pandas’ near extinction.

Relation Extraction

Identify key entities, their types and relations in text

The diagnosis of hypothermia:effect was delayed until it was apparent for several days but resolved with the discontinuation of risperidone:drug and continuation of clozapine:drug

Structured Prediction

• (Risperidone: Drug, adverse effect, Hypothermia: Effect)

• (Shi Liming: Per, work for, Institute of Zoology: Org)

• (Institute of Zoology: Org, Org based in, Kunming: Loc)

Relation Extraction: Pipeline

The diagnosis of hypothermia was delayed until it was apparent for several days but resolved with the discontinuation of risperidone and continuation of clozapine

NER: hypothermia:effect, risperidone:drug, clozapine:drug

RE: for each candidate (drug, effect) pair, is the relation adverse?

Relation Extraction: End-to-End

The diagnosis of hypothermia was delayed until it was apparent for several days but resolved with the discontinuation of risperidone and continuation of clozapine

(risperidone:drug, adverse, hypothermia:effect)

Relation Extraction: Conditional Generation

We frame RE as a conditional generation task

Targets are linearized strings

[(drug, effect), ... ,(drug, effect)]

[(entity_1:type, relation_type, entity_2:type),..., (entity_1:type, relation_type, entity_2:type)]
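A minimal sketch of how relations could be linearized into such a target string. The tuple layout and the function name `linearize_relations` are assumptions made for illustration, not the authors' code.

```python
from typing import List, Tuple

# Assumed layout for this sketch: (entity_1, type_1, relation_type, entity_2, type_2)
Relation = Tuple[str, str, str, str, str]


def linearize_relations(relations: List[Relation]) -> str:
    """Render relations as '[(e1:type, relation, e2:type), ...]'."""
    items = [
        f"({e1}:{t1}, {rel}, {e2}:{t2})"
        for e1, t1, rel, e2, t2 in relations
    ]
    return "[" + ", ".join(items) + "]"


# Relation taken from the hypothermia example sentence in the slides.
example = [("risperidone", "drug", "adverse", "hypothermia", "effect")]
print(linearize_relations(example))
# → [(risperidone:drug, adverse, hypothermia:effect)]
```

A text with no relations maps to the empty list `[]`, which gives the model an explicit target for "nothing to extract".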

Relation Extraction: Conditional Generation

1. Construct few-shot prompts

2. Use GPT-3 to generate linearized target outputs

Few-shot prompt:

List all (drug: adverse effects) pairs in the following text:

[Text]

[[Drug, Adverse Effect], …, [Drug, Adverse Effect]]

List all (drug: adverse effects) pairs in the following text:

[Text]

[[Drug, Adverse Effect], …, [Drug, Adverse Effect]]

…

List all (drug: adverse effects) pairs in the following text:

[Text]

LLM output:

[[Drug, Adverse Effect], …, [Drug, Adverse Effect]]
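The prompt construction step can be sketched as plain string assembly. The instruction text is copied from the slide; the helper names `format_target` and `build_few_shot_prompt` are hypothetical, and the call to the LLM itself is deliberately left out.

```python
INSTRUCTION = "List all (drug: adverse effects) pairs in the following text:"


def format_target(pairs):
    """Linearize (drug, effect) pairs as '[[drug, effect], ...]'."""
    return "[" + ", ".join(f"[{drug}, {effect}]" for drug, effect in pairs) + "]"


def build_few_shot_prompt(demonstrations, query_text):
    """Concatenate solved demonstrations, then the unsolved query.

    demonstrations: list of (text, [(drug, effect), ...]) examples.
    The prompt ends right after the query text so the model
    continues with the linearized target.
    """
    blocks = [
        f"{INSTRUCTION}\n{text}\n{format_target(pairs)}"
        for text, pairs in demonstrations
    ]
    blocks.append(f"{INSTRUCTION}\n{query_text}\n")
    return "\n\n".join(blocks)


demos = [
    (
        "The diagnosis of hypothermia was delayed until it was apparent for "
        "several days but resolved with the discontinuation of risperidone "
        "and continuation of clozapine",
        [("risperidone", "hypothermia")],
    )
]
prompt = build_few_shot_prompt(demos, "[Text]")
print(prompt)
```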

Relation Extraction: Conditional Generation

Evaluating generative models for extraction with exact matching is tricky

Open-ended generation can result in

• Non-conforming outputs

• Spurious false positives: outputs counted as wrong that are actually correct, because of

- semantically similar but lexically different outputs

- non-exhaustive annotations
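To make the evaluation problem concrete, here is a small sketch of exact-match scoring over generated strings. The parser, the lowercase normalization, and the function names are assumptions for illustration; entity strings are assumed to contain no commas or brackets.

```python
import re

# One well-formed pair: "[drug, effect]" with no nested brackets or commas.
PAIR = re.compile(r"\[([^\[\],]+),([^\[\],]+)\]")


def parse_pairs(output):
    """Return a set of (drug, effect) pairs, or None if the output is non-conforming."""
    text = output.strip()
    if not (text.startswith("[") and text.endswith("]")):
        return None
    inner = text[1:-1].strip()
    if not inner:
        return set()
    pairs = PAIR.findall(inner)
    # Anything left after removing well-formed pairs violates the format.
    leftover = PAIR.sub("", inner).replace(",", "").strip()
    if leftover or not pairs:
        return None
    return {(d.strip().lower(), e.strip().lower()) for d, e in pairs}


def exact_match_f1(predicted, gold):
    """F1 under exact string matching; non-conforming output scores as empty."""
    if predicted is None:
        predicted = set()
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = {("risperidone", "hypothermia")}
print(parse_pairs("Risperidone caused hypothermia."))  # non-conforming → None
print(exact_match_f1(parse_pairs("[[risperidone, low body temperature]]"), gold))
```

The second call scores 0.0 even though "low body temperature" is a reasonable paraphrase of "hypothermia", which is the kind of false positive that exact matching cannot distinguish from a real error.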

Relation Extraction: Conditional Generation

We also found instances of incorrectly labeled examples

Relation Extraction: Conditional Generation

Augmenting prompts with chain-of-thought style explanations

• Improves performance

• Reduces non-conforming outputs

Relation Extraction: Conditional Generation

Fine-tune Flan-T5 with explanations generated by GPT-3

Evidence Based Medicine

Use the best available scientific evidence to support medical decisions

• Randomized Controlled Trials and Systematic Reviews are the gold standard

• New reports are published daily

• Difficult to stay up-to-date

Structured Evidence Inference Task

RCTs can describe multiple populations, interventions, and outcomes

1. Extract all the [intervention, comparator, outcome] tuples

2. For each ICO tuple: infer the effect of the intervention and extract the supporting evidence

Structured Evidence Inference Task

Can we frame the evidence inference task as generation?

Linearized Targets

[zinc sulfate capsules, placebo, warts, warts resolved in 68% of the patients in treatment group and 64% of the patients in placebo group, no significant difference]

[zinc sulfate capsules, placebo, recurrence of warts, three patients in treatment group and six patients in placebo group had a recurrence of warts (p=.19), no significant difference]
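The two targets above follow the pattern [intervention, comparator, outcome, evidence, effect]. A minimal sketch of converting between that string and a structured record follows; the class name `EvidenceTuple` and the parsing rule are assumptions. In particular, the parser assumes the first three fields and the final effect label contain no ", " separator, while the evidence text may.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceTuple:
    intervention: str
    comparator: str
    outcome: str
    evidence: str
    effect: str


def linearize(t: EvidenceTuple) -> str:
    """Render one tuple as a bracketed, comma-separated target string."""
    return f"[{t.intervention}, {t.comparator}, {t.outcome}, {t.evidence}, {t.effect}]"


def parse(line: str) -> EvidenceTuple:
    """Invert linearize(): split I, C, O from the left and the effect from the right."""
    text = line.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("target must be wrapped in square brackets")
    head = text[1:-1].split(", ", 3)
    if len(head) != 4:
        raise ValueError("expected at least four comma-separated fields")
    intervention, comparator, outcome, rest = head
    tail = rest.rsplit(", ", 1)
    if len(tail) != 2:
        raise ValueError("missing evidence or effect field")
    evidence, effect = tail
    return EvidenceTuple(intervention, comparator, outcome, evidence, effect)


target = (
    "[zinc sulfate capsules, placebo, warts, warts resolved in 68% of the "
    "patients in treatment group and 64% of the patients in placebo group, "
    "no significant difference]"
)
record = parse(target)
print(record.outcome, "|", record.effect)
```

Splitting from both ends keeps commas inside the evidence sentence intact, which matters because that field is free text copied from the abstract.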

Structured Evidence Inference: Pipeline

SOTA methods for this task used a pipeline approach

Structured Evidence Inference: End-to-End

SOTA methods for this task used a pipeline approach

We fine-tuned Flan-T5 on pairs of abstracts and linearized targets

Structured Evidence Browser

Extracting Medical Claims from Social Media

People use social media to discuss personal health and medical conditions

• Ask questions

• Seek advice

• Share experiences

The unvetted nature of social media makes it vulnerable to

• Misinformation

• Disinformation

Extracting Medical Claims from Social Media

Identify health-related conversations on social media (Reddit)

1. Given a post, extract spans corresponding to:

• Personal experiences

• Questions

• Claims: statements that suggest a causal relationship between an Intervention and an Outcome

2. Given a post and a claim, extract the PICO elements

• Population

• Intervention/Comparators

• Outcomes
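The two-stage task above can be pictured as labeled character spans over a post. The sketch below is only an illustration: the `Span` class, the label strings, and the example post are invented here and do not reflect the actual RedHOT data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int   # character offset where the span begins
    end: int     # character offset just past the span
    label: str   # stage 1: span type; stage 2: PICO element

    def text(self, post: str) -> str:
        return post[self.start:self.end]


def find_span(post: str, phrase: str, label: str) -> Span:
    """Locate the first occurrence of phrase in post and label it."""
    start = post.index(phrase)
    return Span(start, start + len(phrase), label)


# Invented example post, for illustration only.
post = (
    "I have had eczema for years. Does anyone know a good cream? "
    "Fish oil cleared up my rash."
)

# Stage 1: spans for personal experiences, questions, and claims.
experience = find_span(post, "I have had eczema for years.", "personal_experience")
question = find_span(post, "Does anyone know a good cream?", "question")
claim = find_span(post, "Fish oil cleared up my rash.", "claim")

# Stage 2: PICO elements for the claim.
pico = [
    find_span(post, "Fish oil", "intervention"),
    find_span(post, "rash", "outcome"),
]

for span in [experience, question, claim] + pico:
    print(f"{span.label}: {span.text(post)}")
```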

Reddit Health Online Talk (RedHOT)

• Data from 24 subreddits

• Annotations from Amazon Mechanical Turk (MTurk)

SemEval 2023 shared task

We released a subset of the corpus for a shared task at SemEval 2023

This is a challenging task!

RedHOT Application: Content Moderation

Given a claim and its PICO elements, retrieve trustworthy evidence that supports or refutes the claim

RedHOT Application: Content Moderation

Dense Passage Retrieval model to retrieve relevant RCTs from Trialstreamer

Dense Passage Retrieval

1. Encode documents and queries as dense vectors

2. Score relevance as the similarity between the query vector x and the document vector d
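The two steps can be sketched with toy vectors. The encoders are assumed to exist already, so the vectors below are hand-written stand-ins rather than real embeddings, and the dot product is used as the similarity, which is one common choice.

```python
def dot(x, d):
    """Dot-product similarity between a query vector x and a document vector d."""
    return sum(a * b for a, b in zip(x, d))


def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query."""
    scored = [(dot(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Toy stand-ins for encoder outputs (hypothetical ids and values).
x = [1.0, 0.0, 0.5]
documents = {
    "rct_a": [0.9, 0.1, 0.4],
    "rct_b": [0.0, 1.0, 0.0],
    "rct_c": [0.5, 0.5, 0.5],
}
print(rank_documents(x, documents))
# → ['rct_a', 'rct_c', 'rct_b']
```

In practice the document vectors are computed once and indexed, so only the query needs to be encoded at search time.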

RedHOT Application: Content Moderation

Dense Passage Retrieval model to retrieve relevant RCTs from Trialstreamer

Dense Passage Retrieval

Model trained with negative sampling
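One common form of training with negative sampling is a softmax contrastive loss that pushes the score of the relevant document above the scores of sampled negatives. The slides do not say which objective was used, so the sketch below is an assumption, with hypothetical function names and toy vectors.

```python
import math


def dot(x, d):
    return sum(a * b for a, b in zip(x, d))


def contrastive_loss(query, positive, negatives):
    """Negative log-likelihood of the positive document among all candidates.

    loss = -log( exp(s_pos) / (exp(s_pos) + sum_j exp(s_neg_j)) )
    computed with the log-sum-exp trick for numerical stability.
    """
    scores = [dot(query, positive)] + [dot(query, n) for n in negatives]
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[0]


query = [1.0, 0.0]
negatives = [[0.0, 1.0], [0.0, -1.0]]      # both score 0 against the query
aligned = contrastive_loss(query, [2.0, 0.0], negatives)
unaligned = contrastive_loss(query, [0.0, 0.0], negatives)
print(aligned < unaligned)
```

With all three candidates scoring 0, the loss equals log(3); it shrinks as the positive document's score rises relative to the negatives.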

RedHOT Application: Content Moderation

Dense Passage Retrieval model to retrieve relevant RCTs from Trialstreamer

Challenges

• no labeled data

• mismatch between language from social media and RCTs

Pseudo-labeled data

Replace PIO elements from claims with elements sampled from RCTs
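A rough sketch of the pseudo-labeling idea: swap the annotated PIO spans of a claim for the corresponding elements of a sampled RCT. Treating the sampled RCT as the positive document for the resulting pseudo-claim is an assumption of this sketch, as are the data layout, the ids, and the function names. Spans are assumed not to overlap.

```python
import random


def make_pseudo_claim(claim_text, spans, rct_elements):
    """Replace annotated spans with RCT elements of the same role.

    spans: list of (start, end, role) character offsets into claim_text.
    Replacement runs right to left so earlier offsets stay valid.
    """
    pseudo = claim_text
    for start, end, role in sorted(spans, reverse=True):
        if role in rct_elements:
            pseudo = pseudo[:start] + rct_elements[role] + pseudo[end:]
    return pseudo


def sample_training_pair(claim_text, spans, rcts, rng):
    """Pick an RCT at random and return (pseudo_claim, rct_id)."""
    rct = rng.choice(rcts)
    return make_pseudo_claim(claim_text, spans, rct["pio"]), rct["id"]


claim = "zinc cured my warts"
spans = [(0, 4, "intervention"), (14, 19, "outcome")]
rcts = [
    {
        "id": "rct_1",  # hypothetical id and elements
        "pio": {
            "intervention": "vitamin D supplementation",
            "outcome": "asthma exacerbations",
        },
    }
]
pseudo_claim, rct_id = sample_training_pair(claim, spans, rcts, random.Random(0))
print(pseudo_claim, "->", rct_id)
# → vitamin D supplementation cured my asthma exacerbations -> rct_1
```

The pseudo-claim keeps the informal phrasing of the original post while using the RCT's vocabulary for the PIO elements, which is one way to address both challenges listed above.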

RedHOT Application: Content Moderation

Relevance judgements from medical doctors

• 100 claims

• 10 abstracts/claim

Evidence Extraction to Aid Diagnosis

Can we use LLMs to aid physicians in diagnosis?

• retrieve evidence to support a potential diagnosis

• suggest alternative diagnoses given evidence

Evidence Extraction to Aid Diagnosis

Prompt 1:

Read the following clinical note of a patient: [NOTE].

Question: Is the patient at risk of developing sepsis?

Choice -Yes -No.

Answer:

LLM output: Yes

Prompt 2:

Read the following clinical note of a patient: [NOTE].

Extract evidence for infection

Answer:

LLM output: bld cx grew streptococcus
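The two prompts can be generated from templates so that the condition and the finding are parameters. The wording follows the slide; the function names are hypothetical and the LLM call is left out.

```python
def risk_prompt(note: str, condition: str) -> str:
    """Yes/No question about a patient's risk of a given condition."""
    return (
        f"Read the following clinical note of a patient: {note}.\n"
        f"Question: Is the patient at risk of developing {condition}?\n"
        "Choice -Yes -No.\n"
        "Answer:"
    )


def evidence_prompt(note: str, finding: str) -> str:
    """Ask for text from the note that supports a given finding."""
    return (
        f"Read the following clinical note of a patient: {note}.\n"
        f"Extract evidence for {finding}\n"
        "Answer:"
    )


note = "[NOTE]"  # placeholder, as on the slide
print(risk_prompt(note, "sepsis"))
print()
print(evidence_prompt(note, "infection"))
```

Keeping the evidence request separate from the Yes/No question lets a physician check the answer against text extracted from the note itself.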