中国寄生虫学与寄生虫病杂志 (Chinese Journal of Parasitology and Parasitic Diseases)

• Original Article •

Digital Description and Automatic Identification of the Eggs of 11 Major Human Parasites

沈海默,艾琳,蔡玉春,卢艳,陈韶红*   

  1. 中国疾病预防控制中心寄生虫病预防控制所,世界卫生组织热带病合作中心,科技部国家级热带病国际联合研究中心,卫生部寄生虫病原与媒介生物学重点实验室,上海200025
  • Publication date: 2016-10-30  Release date: 2016-11-09

Digital Description and Identification of 11 Kinds of Principal Parasite Eggs

SHEN Hai-mo, AI Lin, CAI Yu-chun, LU Yan, CHEN Shao-hong*   

  1. National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention; WHO Collaborating Centre for Tropical Diseases; National Center for International Research on Tropical Diseases, Ministry of Science and Technology; Key Laboratory of Parasite and Vector Biology, Ministry of Health, Shanghai 200025, China
  • Online: 2016-10-30  Published: 2016-11-09

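Abstract: Objective To use computer technology to assist in identifying parasite eggs, and to establish an egg classifier algorithm and corresponding workflow suitable for clinical use in automated instruments. Methods Eggs of 11 parasites were selected: Clonorchis sinensis, Taenia solium, Enterobius vermicularis, Ascaris lumbricoides, Trichuris trichiura, Spirometra mansoni, Diphyllobothrium latum, Ancylostoma duodenale, Schistosoma japonicum, Paragonimus westermani and Fasciolopsis buski. The eggs were divided into a training group and a testing group and photographed under a microscope, and feature values were extracted with a VC++-based program. A feature-value database was built, multiple classification algorithms were tested on the training database, and the most efficient one was chosen to construct the classifier, yielding an identification method based on multi-feature fusion. Results Images of the eggs of all 11 parasites were obtained. After images that could not be identified or that contained invalid values were removed, the training group comprised 19 844 egg images and the testing group 3 721. Fourteen feature values were collected for each egg, and the 11 kinds of eggs differed significantly in size and color. For example, the smallest of the 11, the Clonorchis sinensis egg, had a mean length, width, area and brightness of 292.24 μm, 192.64 μm, 43 416.61 μm² and 53.84, respectively, whereas the largest, the Fasciolopsis buski egg, had 945.31 μm, 610.88 μm, 536 002.60 μm² and 100.54. A classifier built with dynamically generated weights for multi-feature fusion retrieval achieved a discrimination rate of 88.89% (17 641/19 844) on the training sample set and a recognition rate of 91.83% (3 004/3 271) on the testing sample set, with an average modeling time of 0.01 s. Conclusion A parasite egg classifier algorithm based on feature-value fusion, together with the corresponding workflow, was established, laying a foundation for further study of its feasibility.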

Keywords: Parasite eggs, Multi-feature recognition, Feature-value fusion, Classification algorithm
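
The two rates in the abstract can be checked against the counts given in parentheses. The sketch below assumes the percentages were truncated rather than rounded to two decimal places; that is an inference from the arithmetic, not something the abstract states, and `truncated_percent` is a hypothetical helper name.

```python
def truncated_percent(correct: int, total: int) -> str:
    """Return correct/total as a percentage truncated to two decimals.

    Integer arithmetic is used so floating-point rounding cannot
    change the last digit.
    """
    hundredths = correct * 10_000 // total  # percentage in units of 0.01%
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


# Counts as printed in the abstract (correct / total).
print(truncated_percent(17_641, 19_844))  # training sample set
print(truncated_percent(3_004, 3_271))    # testing sample set
```

Under this assumption the two calls reproduce the reported 88.89% and 91.83%; rounding to the nearest hundredth would give 88.90% and 91.84% instead.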

Abstract: Objective To facilitate the identification of parasite eggs using computer technology, and to establish an egg classification algorithm and workflow suited to clinical use in automated instruments. Methods Eggs of 11 parasites, Clonorchis sinensis, Taenia solium, Enterobius vermicularis, Ascaris lumbricoides, Trichuris trichiura, Spirometra mansoni, Diphyllobothrium latum, Ancylostoma duodenale, Schistosoma japonicum, Paragonimus westermani and Fasciolopsis buski, were selected, divided into a training group and a testing group, and microphotographed. Feature values were extracted using a VC++-based program. A feature-value database was constructed, and a variety of classification algorithms were tested on the training data set. The classifier was constructed using the algorithm with the highest efficiency, and an identification method based on multi-feature fusion was established. Results After removal of images that could not be identified or that contained invalid values, the training group comprised 19 844 egg images and the testing group 3 721 images. Based on the 14 feature values collected, the eggs of the 11 parasite species differed significantly in size and color. For example, the mean length, width, area and brightness of the smallest egg, that of Clonorchis sinensis, were 292.24 μm, 192.64 μm, 43 416.61 μm² and 53.84, respectively, while those of the largest egg, that of Fasciolopsis buski, were 945.31 μm, 610.88 μm, 536 002.60 μm² and 100.54, respectively. With a classifier built using dynamically generated weights for multi-feature fusion, the discrimination rate on the training data set was 88.89% (17 641/19 844) and the recognition rate on the testing data set was 91.83% (3 004/3 271), with an average modeling time of 0.01 s. Conclusion An algorithm for egg classification based on feature fusion has been established, which provides a basis for further study of its feasibility.

Key words: Parasite eggs, Multiple features, Adaptive fusion, Classification algorithm
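
The abstract describes multi-feature fusion with dynamically generated weights but does not give the weighting rule. The following is a minimal sketch of one possible scheme, not the authors' implementation: a nearest-class-mean classifier in which each feature's weight is recomputed per query from how clearly that feature separates the closest class from the runner-up. The function names, the margin-based weighting rule, and the example query are all assumptions made for illustration; only the two sets of class means come from the abstract.

```python
# Class means as reported in the abstract: (length, width, area, brightness).
CENTROIDS = {
    "Clonorchis sinensis": (292.24, 192.64, 43416.61, 53.84),
    "Fasciolopsis buski": (945.31, 610.88, 536002.60, 100.54),
}


def feature_distances(query, centroids):
    """Per-feature distance from the query to each class mean, scaled by the
    spread of the class means so features in different units are comparable."""
    n = len(query)
    spans = []
    for j in range(n):
        values = [c[j] for c in centroids.values()]
        spans.append((max(values) - min(values)) or 1.0)
    return {
        label: [abs(query[j] - c[j]) / spans[j] for j in range(n)]
        for label, c in centroids.items()
    }


def dynamic_weights(distances):
    """Assumed rule: a feature earns more weight for this query when it
    separates the nearest class from the runner-up by a wide margin.
    Requires at least two classes."""
    n = len(next(iter(distances.values())))
    margins = []
    for j in range(n):
        ranked = sorted(d[j] for d in distances.values())
        margins.append(ranked[1] - ranked[0])
    total = sum(margins)
    if total == 0:
        return [1.0 / n] * n  # no feature is informative: fall back to equal weights
    return [m / total for m in margins]


def classify(query, centroids=CENTROIDS):
    """Fuse the per-feature distances with query-specific weights and return
    the class with the smallest fused distance, plus the weights used."""
    distances = feature_distances(query, centroids)
    weights = dynamic_weights(distances)
    fused = {
        label: sum(w * d for w, d in zip(weights, dists))
        for label, dists in distances.items()
    }
    return min(fused, key=fused.get), weights


# Hypothetical measurement of one egg, in the same units as the class means.
label, weights = classify((300.0, 200.0, 45000.0, 60.0))
print(label, [round(w, 3) for w in weights])
```

Because the weights are derived from the query itself, a feature on which the query sits midway between two classes contributes nothing to the decision, while clearly discriminating features dominate. A full version would need class statistics for all 11 species and all 14 features, which the abstract does not list.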