Automatic Retrieval of Definitions of Technical Concepts
Date
2024-07-11
Publisher
Jaén: Universidad de Jaén
Abstract
Definition extraction in natural language processing faces the challenge of identifying the non-specific pattern of a concept's definition when that definition is spread across multiple documents. This article proposes a system based on Retrieval-Augmented Generation (RAG) that uses large language models to search for and consolidate definitions. An evaluation of three models (GPT-3.5-turbo, Mistral-7B-Instruct, and Gemma-7B) on scientific articles concludes that GPT-3.5 is the most effective for this task.