A Study of data quality management in the National Biobank of Korea
  • Date: 2018-02-28 22:36
  • Update: 2018-02-28 22:36
  • Division: Division of Strategic Planning for Emerging Infectious Diseases
  • Tel: 043-719-7271

Ji Byeonggon, Lee Sang-Hyeop, Jeon Jae-Pil
Division of Biobank for Health Sciences, Center for Genome Science, KNIH, KCDC

Background: Most biobanks enter biospecimen-related inventory data into their databases manually, which can introduce errors. The National Biobank of Korea (NBK) operates the Human Biobank Information System (HuBIS), developed in-house, which stores and disseminates data on biological samples.
Methodology/Results: The HuBIS handles biobank inventory data generated by a central biobank and 17 regional biobanks, which together form the Korea Biobank Network (KBN). Here, we report an analysis of the data quality and database structure of the HuBIS, with the aim of improving the quality and reliability of the database. The HuBIS database was analyzed for patterns of data errors in 12 assessment areas, including uniqueness and column consistency, according to the database quality certification-value (DQC-V) of the Korea Data Agency. The error rates for uniqueness and column consistency were 0.17% and 3.3%, respectively, and the overall error rate across the full evaluation standard was 3.04%, which is close to a Sigma level of 3.2 and to the Silver class of the DQC-V standard. In addition, we analyzed the entity relationship diagram (ERD) of the database structure and showed that data quality can be improved efficiently through better data normalization.
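As an illustration of how error rates of this kind might be computed, the following is a minimal Python sketch and not the procedure used in the study. The records, field names, format patterns, and the two function names are all hypothetical, and the definitions are assumptions: a uniqueness error is counted as each surplus row sharing a key value that should be unique, and a column consistency error as a value that does not match the expected format for its column. The actual DQC-V criteria may be defined differently.

```python
import re
from collections import Counter

# Hypothetical inventory records; field names and values are illustrative
# only and do not reflect the actual HuBIS schema.
records = [
    {"sample_id": "S0001", "collected_on": "2017-03-02", "volume_ul": "500"},
    {"sample_id": "S0002", "collected_on": "2017-03-02", "volume_ul": "250"},
    {"sample_id": "S0002", "collected_on": "2017/03/05", "volume_ul": "250"},
    {"sample_id": "S0004", "collected_on": "2017-03-09", "volume_ul": "n/a"},
]

# Assumed format rules per column (hypothetical).
PATTERNS = {
    "collected_on": r"\d{4}-\d{2}-\d{2}",  # ISO-style date
    "volume_ul": r"\d+",                   # non-negative integer
}


def uniqueness_error_rate(rows, key):
    """Fraction of rows that are surplus copies of an already-seen key."""
    counts = Counter(row[key] for row in rows)
    surplus = sum(count - 1 for count in counts.values())
    return surplus / len(rows)


def column_consistency_error_rate(rows, patterns):
    """Fraction of checked values that do not match their column's pattern."""
    checked = 0
    failed = 0
    for row in rows:
        for column, pattern in patterns.items():
            checked += 1
            if re.fullmatch(pattern, row[column]) is None:
                failed += 1
    return failed / checked


if __name__ == "__main__":
    print(f"uniqueness error rate: {uniqueness_error_rate(records, 'sample_id'):.2%}")
    print(f"column consistency error rate: {column_consistency_error_rate(records, PATTERNS):.2%}")
```

In this toy data set, one of four rows repeats a sample identifier and two of eight checked values break their format rule. A real assessment would run comparable checks over every table and column covered by the evaluation standard.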
Conclusion: Based on this assessment of database quality, we plan to apply for data quality certification from the Korea Data Agency and to implement a 5-year roadmap for data quality management of the HuBIS.

Keywords: Biobank, Data quality management, Database quality, Roadmap
This public work may be used under the terms of the public interest source + commercial use prohibition + nonrepudiation conditions