Automated UML class diagram generation from textual requirements using NLP techniques

Bibliographic Details
Main Authors: Meng, Yang; Ban, Ainita
Format: Article
Language: English
Published: Politeknik Negeri Padang 2024
Online Access: http://psasir.upm.edu.my/id/eprint/114504/
http://psasir.upm.edu.my/id/eprint/114504/1/114504.pdf
author Meng, Yang
Ban, Ainita
building UPM Institutional Repository
collection Online Access
description Translating textual requirements into precise Unified Modeling Language (UML) class diagrams is difficult because requirements text is unstructured and often ambiguous, which can lead to inconsistencies and misunderstandings in the early stages of software development. Current methods often struggle with these challenges because they are limited in handling diverse and complex textual requirements, which may result in incomplete or inaccurate UML diagrams. This study proposes a Natural Language Processing (NLP) model that analyzes textual requirements and extracts the information needed to generate UML class diagrams, with the goal of keeping the diagrams accurate and consistent with the requirement descriptions. The research employs a four-step approach: preprocessing to handle text noise and redundancy, sentence classification to distinguish between "class" and "relationship" sentences, syntactic analysis to examine grammatical structures, and UML class diagram generation based on predefined rules. The model achieved a classification accuracy of 88.46% and an Area Under the Curve (AUC) of 0.9287, indicating robust performance in distinguishing class definitions from relationships. The study also highlights that existing methods may not fully address the nuances of translating complex textual requirements into accurate UML diagrams. It demonstrates an automated method for generating UML class diagrams from textual requirements and suggests that future research could expand datasets, optimize feature extraction, explore advanced models, and develop automated rule generation methods.
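The four-step approach named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the classifier, the parser, or the rule set, so regular expressions stand in for both the trained sentence classifier and the syntactic analysis, and PlantUML text is assumed as the output format. All names here (`preprocess`, `classify`, `analyse`, `generate_plantuml`, the patterns, the sample requirements) are hypothetical.

```python
import re

# Hypothetical stand-ins: the record does not describe the real classifier or rules.
DET = r"(?:an?|the|each|every)"
CLASS_RE = re.compile(rf"^{DET}\s+(\w+)\s+has\s+(.+)$", re.I)
REL_RE = re.compile(rf"^{DET}\s+(\w+)\s+(\w+)\s+(?:{DET}|many|several)\s+(\w+)$", re.I)


def preprocess(text):
    """Step 1: normalise whitespace, split into sentences, drop exact repeats."""
    seen, result = set(), []
    for sentence in re.split(r"[.!?]+", " ".join(text.split())):
        sentence = sentence.strip()
        if sentence and sentence.lower() not in seen:
            seen.add(sentence.lower())
            result.append(sentence)
    return result


def classify(sentence):
    """Step 2: label a sentence 'class', 'relationship', or 'other' (pattern-based)."""
    if CLASS_RE.match(sentence):
        return "class"
    if REL_RE.match(sentence):
        return "relationship"
    return "other"


def singular(word):
    """Naive singularisation, good enough for the sample sentences only."""
    return word[:-1] if word.endswith("s") and not word.endswith("ss") else word


def analyse(sentence, label):
    """Step 3: pull out class/attribute names or a (source, verb, target) triple."""
    if label == "class":
        name, attr_text = CLASS_RE.match(sentence).groups()
        parts = re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", attr_text)
        attrs = [re.sub(rf"^{DET}\s+", "", p, flags=re.I) for p in parts if p]
        return name.capitalize(), attrs
    source, verb, target = REL_RE.match(sentence).groups()
    return source.capitalize(), verb.lower(), singular(target).capitalize()


def generate_plantuml(text):
    """Step 4: apply fixed rules to emit a PlantUML class diagram."""
    classes, relations = {}, []
    for sentence in preprocess(text):
        label = classify(sentence)
        if label == "class":
            name, attrs = analyse(sentence, label)
            classes.setdefault(name, []).extend(attrs)
        elif label == "relationship":
            source, verb, target = analyse(sentence, label)
            classes.setdefault(source, [])
            classes.setdefault(target, [])
            relations.append((source, verb, target))
    lines = ["@startuml"]
    for name, attrs in classes.items():
        lines.append(f"class {name} {{")
        lines.extend(f"  {attr}" for attr in attrs)
        lines.append("}")
    lines.extend(f"{s} --> {t} : {v}" for s, v, t in relations)
    lines.append("@enduml")
    return "\n".join(lines)


REQUIREMENTS = (
    "A customer has a name and an email. A customer places many orders. "
    "An order contains several items. A customer places many orders."
)

if __name__ == "__main__":
    print(generate_plantuml(REQUIREMENTS))
```

Each step sits behind its own function so that a learned classifier or a real dependency parser could replace the regular expressions without touching the generation rules.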
first_indexed 2025-11-15T14:21:48Z
format Article
id upm-114504
institution Universiti Putra Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T14:21:48Z
publishDate 2024
publisher Politeknik Negeri Padang
recordtype eprints
repository_type Digital Repository
spelling upm-114504 2025-01-22T02:48:16Z http://psasir.upm.edu.my/id/eprint/114504/
Meng, Yang and Ban, Ainita (2024) Automated UML class diagram generation from textual requirements using NLP techniques. International Journal on Informatics Visualization, 8 (3-2). pp. 1905-1915. ISSN 2549-9610; eISSN: 2549-9904
Politeknik Negeri Padang 2024 Article PeerReviewed text en cc_by_sa_4
http://psasir.upm.edu.my/id/eprint/114504/1/114504.pdf
https://joiv.org/index.php/joiv/article/view/3482
DOI: 10.62527/joiv.8.3-2.3482