Universidad San Sebastián  
 

Institutional Repository, Universidad San Sebastián


dc.contributor.author Villena, Fabián
dc.contributor.author Bravo-Marquez, Felipe
dc.contributor.author Dunstan, Jocelyn
dc.date.accessioned 2026-02-08T03:28:19Z
dc.date.available 2026-02-08T03:28:19Z
dc.date.issued 2025-12
dc.identifier.issn 1472-6947
dc.identifier.other Mendeley: 641e9f8f-607a-395c-b6f7-982dafde62a7
dc.identifier.uri https://repositorio.uss.cl/handle/uss/20429
dc.description Publisher Copyright: © The Author(s) 2025.
dc.description.abstract Background: Clinical decision-making in healthcare often relies on unstructured text data, which can be challenging to analyze using traditional methods. Natural Language Processing (NLP) has emerged as a promising solution, but its application in clinical settings is hindered by restricted data availability and the need for domain-specific knowledge. Methods: We conducted an experimental analysis to evaluate the performance of various NLP modeling paradigms on multiple clinical NLP tasks in Spanish. These tasks included referral prioritization and referral specialty classification. We simulated three clinical settings with varying levels of data availability and evaluated the performance of four foundation models. Results: Clinical-specific pre-trained language models (PLMs) achieved the highest performance across tasks. For referral prioritization, Clinical PLMs attained an 88.85 % macro F1 score when fine-tuned. In referral specialty classification, the same models achieved a 53.79 % macro F1 score, surpassing domain-agnostic models. Continuing pre-training with environment-specific data improved model performance, but the gains were marginal compared to the computational resources required. Few-shot learning with large language models (LLMs) demonstrated lower performance but showed potential in data-scarce scenarios. Conclusions: Our study provides evidence-based recommendations for clinical NLP practitioners on selecting modeling paradigms based on data availability. We highlight the importance of considering data availability, task complexity, and institutional maturity when designing and training clinical NLP models. Our findings can inform the development of effective clinical NLP solutions in real-world settings. en
dc.language.iso eng
dc.relation.ispartof vol. 25 Issue: no. 1 Pages:
dc.source BMC Medical Informatics and Decision Making
dc.title NLP modeling recommendations for restricted data availability in clinical settings en
dc.type Article
dc.identifier.doi 10.1186/s12911-025-02948-2
dc.publisher.department Faculty of Dentistry (Facultad de Odontología)


Files in this item

There are no files associated with this item.
