2026_Data Integration Using Native JavaScript Object Notation (NJSON) Model for Semi-Structured Data Format

Bibliographic Details
Format: General Document
_version_ 1860798361305612288
building INTELEK Repository
collection Online Access
collectionurl https://intelek.unisza.edu.my/intelek/pages/search.php?search=!collection8802
copyright Copyright©PWB2026
country Malaysia
date 2025-03-20
format General Document
id 17461
institution UniSZA
originalfilename 17461_24daba57afcf4ce.pdf
person Nurul Anis Alia Ahmad Shah
recordtype oai_dc
resourceurl https://intelek.unisza.edu.my/intelek/pages/view.php?ref=17461
sourcemedia Server storage, Scanned document
spelling 17461 https://intelek.unisza.edu.my/intelek/pages/view.php?ref=17461 https://intelek.unisza.edu.my/intelek/pages/search.php?search=!collection8802 General Document Malaysia Library Staff (Top Management) Library Staff (Management) Library Staff (Support) Terengganu Faculty of Informatics & Computing English application/pdf 1.6 Public Access Server storage Scanned document Universiti Sultan Zainal Abidin 191 Dissertations, Academic Performance—Evaluation Adobe PDF Library 21.1.167 Copyright©PWB2026 Thesis Nurul Anis Alia Ahmad Shah Data Integration; Native JavaScript Object Notation (NJSON); Native XML (NXD); Semi-Structured Data; Data Extraction; Data Conversion; XML; JSON; Query Processing; Data Insertion Response Time; Database Management Systems (DBMS); Data Retrieval; Benchmark Datasets; Data integration (Computer Science); Database Management; Information Retrieval 2026_Data Integration Using Native Javascript Object Notation (Njson) Model for Semi-Structured Data Format 2025-03-20 uuid:acb5b657-3bba-4e4b-8752-2b1bc09c85ec 17461_24daba57afcf4ce.pdf
spellingShingle 2026_Data Integration Using Native Javascript Object Notation (Njson) Model for Semi-Structured Data Format
state Terengganu
subject Dissertations, Academic
Performance—Evaluation
Data integration (Computer Science)
Database Management
Information Retrieval
summary Data integration has been widely studied, especially regarding its diverse approaches. Data integration is an important process in many different contexts, including the commercial and scientific spheres. Data integration is the process of combining information from various data sources into a single, unified view. A unified view is needed to ensure different applications can extract the data and dump it into their applications. Two processes are involved in data integration: data extraction and conversion. Recent research has shown that data extraction and conversion play an essential role in data integration. Data extraction is the process of taking data from data sources for later processing or storage. It combines many data sources and extracts them into various data formats. Meanwhile, data conversion is the process of converting data into a universal format. Both processes allow for the extraction of data from different data sources and provide a single, unified view in a universal format. A few approaches have been used in data integration, such as Extensible Markup Language (XML) and JavaScript Object Notation (JSON). XML is one of the data formats that supports data reading. The development and formulation models for data extraction in data integration are: (1) map to data sources; (2) extract data sources; (3) dump data; and (4) generate a data model. A specified data model will be generated for data retrieval purposes. Three different datasets will be used: NASA, SigmodRecord, and DBLP. Data insertion response time will be measured based on the data extraction process from the XML file, dumping it into the DBMS, and generating a new Native XML (NXD) or Native JSON (NJSON). Then, query processing response time will be measured and evaluated based on queries of different complexity. Three datasets—SigmodRecord, NASA, and DBLP—are used. Experiment 1 includes data conversion and extraction from three datasets into NXD and NJSON.
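The four-step model above (map, extract, dump, generate) ends in the conversion of XML records into a JSON-style data model. The thesis does not specify an implementation language, so the following is only a minimal Python sketch of that XML-to-JSON conversion step, using hypothetical SigmodRecord-style element names:

```python
import json
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Recursively convert an ElementTree element into a plain dict
    (a simplified NJSON-style mapping; attributes and namespaces omitted)."""
    children = list(elem)
    if not children:
        # Leaf element: keep its text content.
        return elem.text.strip() if elem.text and elem.text.strip() else None
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated sibling tags become a JSON array.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result

# Hypothetical fragment in the shape of the SigmodRecord benchmark.
xml_source = """
<SigmodRecord>
  <issue>
    <volume>11</volume>
    <articles>
      <article><title>Data Integration</title></article>
      <article><title>Query Processing</title></article>
    </articles>
  </issue>
</SigmodRecord>
"""

root = ET.fromstring(xml_source)
njson = {root.tag: element_to_dict(root)}
print(json.dumps(njson, indent=2))
```

The resulting dict can then be dumped into a document-oriented DBMS, which corresponds to steps (3) and (4) of the model.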
Data insertion response time has been computed four times so that the average data insertion response time of NXD and NJSON can be displayed. By using Formula 5.1, NJSON is faster by 9.10% in SigmodRecord, 9.11% in NASA, and 17.45% in DBLP compared to NXD. Experiment 2 involves query processing response time from three datasets with three complexities: I, II, and III. Query processing response time has been computed four times so that the average query processing response time of NXD and NJSON can be displayed. By using Formula 5.2 for each data model in every benchmark dataset, NJSON can reduce response time by 8.87% (complexity I), 9.30% (complexity II), and 4.65% (complexity III) in SigmodRecord; 9.05% (complexity I), 8.96% (complexity II), and 3.07% (complexity III) in NASA; and 14.70% (complexity I), 7.85% (complexity II), and 5.97% (complexity III) in DBLP compared to NXD. Performance analysis includes Precision, Recall, and F-Measure from three datasets with three complexities. NJSON is more accurate than NXD because it can handle large amounts of data, especially semi-structured data, which reduces irrelevant data extraction and improves valuable data retrieval. NXD and NJSON are two approaches that have been reviewed in this research for data integration. The NJSON approach is proposed by combining NXD and NJSON elements to produce better performance in data integration. SigmodRecord, NASA, and DBLP datasets have been used for experimental purposes. Two experiments have been conducted. NJSON indicated better performance in terms of data insertion response time and query processing response time compared to NXD. In this case, NJSON has proven efficient and reliable enough to become an alternative for data integration.
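The exact forms of Formulas 5.1 and 5.2 and the raw timings are not given in the abstract; assuming the usual percentage-reduction form and the standard Precision/Recall/F-Measure definitions, figures like those reported can be computed as follows (the timings and retrieval counts below are hypothetical):

```python
def pct_reduction(nxd_time, njson_time):
    """Assumed form of Formulas 5.1/5.2 (not stated in the abstract):
    percentage by which NJSON reduces response time relative to NXD."""
    return (nxd_time - njson_time) / nxd_time * 100

def precision(relevant_retrieved, total_retrieved):
    """Fraction of retrieved records that are relevant."""
    return relevant_retrieved / total_retrieved

def recall(relevant_retrieved, total_relevant):
    """Fraction of relevant records that were retrieved."""
    return relevant_retrieved / total_relevant

def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Hypothetical timings in seconds (the thesis's measurements are not in the abstract):
reduction = pct_reduction(nxd_time=2.000, njson_time=1.818)  # about 9.1%

# Hypothetical retrieval counts for one query complexity:
p = precision(relevant_retrieved=45, total_retrieved=50)
r = recall(relevant_retrieved=45, total_relevant=60)
f = f_measure(p, r)
print(f"reduction={reduction:.2f}%  P={p:.2f}  R={r:.2f}  F={f:.2f}")
```

Averaging such reductions over the four runs per dataset, and computing P/R/F per complexity level, yields the per-dataset comparison between NXD and NJSON described above.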
title 2026_Data Integration Using Native Javascript Object Notation (Njson) Model for Semi-Structured Data Format
title_full 2026_Data Integration Using Native Javascript Object Notation (Njson) Model for Semi-Structured Data Format
title_fullStr 2026_Data Integration Using Native Javascript Object Notation (Njson) Model for Semi-Structured Data Format
title_full_unstemmed 2026_Data Integration Using Native Javascript Object Notation (Njson) Model for Semi-Structured Data Format
title_short 2026_Data Integration Using Native Javascript Object Notation (Njson) Model for Semi-Structured Data Format
title_sort 2026_data integration using native javascript object notation (njson) model for semi-structured data format