Intelligent Healthcare Based on NLG Algorithms: Application Scenarios and Construction Experience
In the near future, customized NLP systems will remain one of the key solutions to help the large-scale intelligent healthcare industry achieve business goals and serve as a primary battleground for AI implementation.
Artificial intelligence is currently in a highly active phase, and the field of natural language processing (NLP) has been thriving for a decade. In complex tasks such as reading comprehension, language translation, and creative writing, computers now perform close to human levels. These language understanding capabilities benefit from open-source deep learning tools (such as PyText and pretrained language models like BERT), big data frameworks (Hadoop, Spark, Spark NLP), and cloud computing (providers offering GPUs and hosted NLP services).
Currently, companies working in the NLP field include Tencent, iFlytek, Microsoft, AISpeech, Huawei, and others.
In the healthcare sector, some applications have moved from science fiction to reality. AI systems have passed medical licensing exams in China and the UK, outperforming average doctors. The latest systems can diagnose 55 pediatric diseases more accurately than junior doctors.
However, these systems are harder to build than some of the first computer vision deep learning applications (e.g., analyzing an image) because they require broader medical knowledge, handle more diverse inputs, and must understand context.
I have been fortunate to participate in building NLP systems for healthcare. This article aims to share some of the knowledge I’ve gained, hoping to help you build similar systems faster and better.
Natural language processing consists of natural language understanding (NLU) and natural language generation (NLG). NLG is the computer's ability to "write": it converts structured data into human-readable text. Based on key information and its internal machine representation, it automatically generates high-quality natural language text through a planning process.
Today’s massive data volumes are beyond human processing capacity; NLG humanizes data to assist people. NLG systems use data analysis and AI techniques to analyze complex datasets and employ computational linguistics to communicate findings in high-quality written explanations.
How NLG works: it takes structured data or abstract propositions as input, plans the content and structure of the message, chooses words and syntax, and realizes text that closely matches the desired output.
Example: given the structured fact that Madonna's occupation is "singer," an NLG system generates: "Madonna is a singer."
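The Madonna example above can be sketched with a minimal template-based realizer. This is an illustrative toy, not a production NLG system; the `realize` function and its template table are assumptions introduced here.

```python
# Minimal sketch of template-based surface realization: one structured
# (subject, predicate, object) fact is rendered as a readable sentence.

def realize(subject: str, predicate: str, obj: str) -> str:
    """Render a single structured fact as a sentence via a template table."""
    templates = {
        "occupation": "{s} is a {o}.",
        "birthplace": "{s} was born in {o}.",
    }
    return templates[predicate].format(s=subject, o=obj)

print(realize("Madonna", "occupation", "singer"))  # → Madonna is a singer.
```

Real NLG pipelines add content selection, aggregation, and referring-expression handling on top of this realization step.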
NLG can help patients understand their health conditions and make better healthcare choices. It also assists patients in self-care, including lifestyle changes, managing chronic diseases, and adhering to treatment plans.
For instance, many diabetics use sensors to measure blood sugar levels but struggle to interpret the data, often overreacting to fluctuations. NLG systems can explain and contextualize blood sugar changes, helping diabetics respond appropriately.
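A rule-based summarizer along the lines described above might look like the following sketch. The threshold values and message wording are my own illustrative assumptions, not clinical guidance or the article's actual system.

```python
# Hedged sketch: contextualizing a day's glucose readings (mmol/L) with
# simple rules. Thresholds and messages are illustrative only.

def summarize_glucose(readings_mmol):
    lo, hi = 3.9, 10.0  # a commonly cited target range; an assumption here
    below = sum(r < lo for r in readings_mmol)
    above = sum(r > hi for r in readings_mmol)
    if below == 0 and above == 0:
        return "All readings today were within the target range."
    parts = []
    if above:
        parts.append(f"{above} reading(s) were above target")
    if below:
        parts.append(f"{below} reading(s) were below target")
    return ", and ".join(parts) + "; the rest were in range."

print(summarize_glucose([5.2, 6.1, 11.3, 4.8]))
```

The point is the framing: instead of showing a raw chart, the system tells the patient what the fluctuation means, which helps avoid overreaction.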
Clinicians are particularly enthusiastic about automated reporting tools for two reasons: automation saves time, and it reduces errors and omissions while ensuring data consistency.
I have worked on several systems in this field, primarily generating handover reports (e.g., nursing shift handovers, first responders transferring to medical staff) and am aware of many other NLG projects in this domain.
I believe NLG has significant potential for clinical decision support. There is substantial evidence that clinicians’ current data interpretation methods (via visualizations or tables) are sometimes ineffective. Text summaries can highlight critical information not visible in visualizations, aiding decision-making. In fact, automated report generation is a stronger selling point than clinical decision support.
Most importantly, NLG can enhance patients’ understanding of their conditions and support them in making better treatment decisions.
In practice, off-the-shelf NLP libraries and algorithms for Chinese face various challenges in the healthcare industry’s "unique language." Not only do named entity recognition or entity resolution models fail, but even basic tasks like tokenization, part-of-speech tagging, and sentence segmentation are ineffective for most medical sentences.
Moreover, the healthcare industry has hundreds of "languages." Avoid building a universal medical NLP system. The reality is that each subspecialty and its communication forms are fundamentally different, making a one-size-fits-all approach impossible.
Additionally, each medical specialty has many variations. For example, pre-authorization requests for MRI versus implantable spinal stimulators require entirely different criteria. Another example is the use of distinct terminology for different cancers in pathology.
These issues have real-world implications: My company is working on a project requiring different NLP models to extract facts about lung, breast, and colon cancers from pathology reports.
So far, Amazon’s Comprehend Medical only focuses on normalizing drug values (see the "aspirin" example above). The service also offers standard medical named entity recognition but falls short of meeting specific application needs.
I tested several popular NLP cloud services. In one test, the only medical term recognized by two out of six engines was "Tylenol" as a product.
This highlights how "medical language" differs from general human language.
Here are some projects we’ve worked on:
1) Deep Learning-Based Sentence Segmentation
While segmenting Wikipedia articles often requires only regex, processing multi-page clinical documents is far more challenging. Algorithms must handle headers, footers, lists, enumerations, annotations, two-column formats, and other layout issues.
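To make the layout problem concrete, here is a small rule-assisted sketch: it drops layout lines (headers, separators, page markers, bare list numbers) before splitting sentences. The regex patterns and the final naive split are stand-ins; in the systems described above, a trained deep learning model replaces the boundary-splitting step.

```python
import re

# Lines that are layout, not prose: page markers, separator rules,
# bare enumeration markers, ALL-CAPS section headers. Illustrative patterns.
LAYOUT_LINE = re.compile(r"^\s*((?i:page)\s+\d+|[-=]{3,}|\d+[.)]|[A-Z ]{4,}:?)\s*$")

def segment(doc: str):
    """Skip layout lines, then split remaining lines into sentences."""
    sentences = []
    for line in doc.splitlines():
        if not line.strip() or LAYOUT_LINE.match(line):
            continue  # drop headers, separators, page numbers
        # Naive punctuation-based split; the trained segmenter goes here.
        sentences.extend(
            s.strip() for s in re.split(r"(?<=[.!?])\s+", line) if s.strip()
        )
    return sentences

doc = "DISCHARGE SUMMARY\n----\nPatient is stable. Follow up in two weeks.\nPage 2"
print(segment(doc))
```

Even this toy shows why regex alone fails at scale: every new document template adds layout patterns the rules have never seen.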
2) Medical-Specific Part-of-Speech Tagging
Medical models need not only different weights but also additional part-of-speech tags, which improve the accuracy of medical named entity recognition.
3) Medical-Specific Normalization Algorithms
In practice, named entity recognition alone is often useless. Identifying "eyes" and "infection" in "both eyes appear infected" as medical terms isn’t helpful. Instead, tagging the entire text block with a standard SNOMED-CT code (e.g., 312132001) and normalizing variations of the same finding is far more useful. This allows applications to build business logic based on the code, regardless of how it was expressed in the original text.
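The difference between recognition and normalization can be sketched as a lookup from variant phrasings to one canonical code. The variant list and exact-match lookup below are illustrative assumptions; production resolvers use trained models and fuzzy matching. The code 312132001 is the one cited in the text above.

```python
import re

# Hedged sketch: entity *resolution* rather than bare recognition.
# Different phrasings of the same clinical finding map to one SNOMED-CT
# code, so downstream business logic keys on the code, not the wording.
VARIANT_TO_CODE = {
    "both eyes appear infected": "312132001",  # code from the article's example
    "bilateral eye infection": "312132001",    # assumed variant of same finding
}

def normalize(text: str):
    """Return the SNOMED-CT code for a finding phrase, or None if unknown."""
    key = re.sub(r"\s+", " ", text.lower().strip())
    return VARIANT_TO_CODE.get(key)

print(normalize("Both eyes appear infected"))  # → 312132001
```

Whatever the matching machinery, the contract is the same: the application sees `312132001` regardless of how the clinician phrased the finding.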
One approach to building an AI system is to start with annotated validation datasets. For example, if automating outpatient case coding to ICD-10, have clinicians define representative samples, anonymize them, and let professional coders annotate (assign correct codes).
If extracting key events from radiology reports or identifying overlooked safety events in patient records, first have clinicians define and annotate samples.
Side note: You’ll see job postings for "data annotators" at major AI companies. Salaries vary widely by industry—healthcare annotators earn more, and many are part-time.
This approach often uncovers pitfalls before involving data science teams (and wasting time). If you can’t obtain enough data or anonymize it at scale, reliable models can’t be built.
If clinicians can’t agree on annotations, the first step is to align on clinical guidelines rather than asking data scientists to automate inconsistency.
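Disagreement is worth quantifying before anyone writes a model. A standard measure is Cohen's kappa, which scores agreement between two annotators beyond what chance would produce; values near zero suggest the guideline, not the data science, is the problem. The ICD-10-style labels below are hypothetical.

```python
from collections import Counter

# Cohen's kappa: agreement between two annotators, corrected for chance.
def cohens_kappa(labels_a, labels_b):
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    expected = sum(ca[k] * cb[k] for k in ca) / (n * n)  # chance agreement
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two clinician-annotators to five cases.
a = ["J45", "J45", "I10", "E11", "I10"]
b = ["J45", "I10", "I10", "E11", "I10"]
print(cohens_kappa(a, b))
```

A run like this on a pilot sample tells you whether to proceed to annotation at scale or to go back and tighten the guideline first.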
Finally, if you find yourself dealing with highly imbalanced classes (such as when studying conditions that affect only a few people annually), it may be wise to redefine the problem before bringing in data scientists.
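A quick imbalance check on the label distribution makes this concrete. The counts below are hypothetical; with ratios in this range, reframing (e.g., as anomaly detection or case-level triage) usually beats plain classification.

```python
from collections import Counter

# Sketch: spotting severe class imbalance before modeling begins.
def imbalance_ratio(labels):
    """Ratio of the most common class count to the least common."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Hypothetical screening labels: a rare condition in 10 of 1000 records.
labels = ["negative"] * 990 + ["rare_condition"] * 10
print(imbalance_ratio(labels))  # → 99.0
```

Checking this number on day one is far cheaper than discovering it after a model has been trained and deployed.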
The purpose of the labeled validation set is to measure the highest accuracy that standard libraries or cloud services can achieve against users' specific needs.
This approach allows evaluation of the difficulty level for each service, including: training custom models, defining domain-specific features, required pipeline steps for the solution, and explaining results to clients.
Once you have a representative, agreed-upon, and properly labeled validation set, you can begin testing existing libraries and cloud service providers. It's highly likely that this testing will immediately reveal gaps between each product's capabilities and your requirements.
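That gap analysis amounts to scoring each candidate service against the gold labels. The provider names, document IDs, and outputs below are fabricated stand-ins to show the shape of the comparison.

```python
# Sketch: scoring off-the-shelf services against the agreed validation set.
# All data here is hypothetical; real runs would call each provider's API.
gold = {"doc1": "E11", "doc2": "J45", "doc3": "I10"}
provider_outputs = {
    "service_a": {"doc1": "E11", "doc2": "J45", "doc3": "E11"},
    "service_b": {"doc1": "E11", "doc2": None, "doc3": None},
}

def accuracy(pred, gold):
    """Fraction of gold-labeled documents the service coded correctly."""
    return sum(pred.get(k) == v for k, v in gold.items()) / len(gold)

for name, preds in provider_outputs.items():
    print(name, accuracy(preds, gold))
```

A table of these scores per provider is usually all it takes to expose which capability gaps you will have to close with custom models.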
In this article, I've provided a brief introduction to product design considerations for customized NLP medical services, starting from the deconstruction of healthcare business frameworks.
For the foreseeable future, business-customized NLP systems will remain one of the crucial systems that can genuinely help large-scale smart healthcare industries achieve their business objectives, and will continue to be a primary battleground for AI implementation.