The MeSH-gram Neural Network Model: Extending Word Embedding Vectors with MeSH Concepts for UMLS Semantic Similarity and Relatedness in the Biomedical Domain

Saïd Abdeddaïm (a), Sylvestre Vimard (a), Lina F. Soualmia (a)

(a) Normandie Univ., UNIROUEN, UNIHAVRE, INSA Rouen, LITIS, F-76000, Rouen, France

Abstract

Eliciting semantic similarity between concepts remains a challenging task. Recent approaches founded on embedding vectors have gained in popularity as they have proven able to capture semantic relationships efficiently. The underlying idea is that two words that have close meanings appear in similar contexts. In this study, we propose a new neural network model named “MeSH-gram”, which relies on a straightforward approach that extends the skip-gram neural network model by considering MeSH (Medical Subject Headings) descriptors instead of words. Trained on the publicly available PubMed/MEDLINE corpus, MeSH-gram is evaluated on reference standards manually annotated for semantic similarity. MeSH-gram is first compared to skip-gram with vectors of size 300 and several context window sizes. A deeper comparison is then performed with twenty existing models. All the obtained Spearman’s rank correlations between human scores and computed similarities show that MeSH-gram (i) outperforms the skip-gram model, and (ii) is comparable to the best methods, which however require more computation and external resources.

Introduction

Eliciting semantic similarity and relatedness between concepts is a major issue in the biomedical domain. Different measures have been proposed over the last decades [1]. Those measures quantify the degree to which two concepts are similar. They rely either on knowledge-based approaches using ontologies and terminologies, or on corpus-based approaches founded on distributional statistics (e.g. literature-based drug discovery [2-5]).
Several important clinical applications rely on semantic similarity and relatedness [6], such as biomedical information extraction and retrieval, clinical decision support, or disease prediction. For instance, biomedical information extraction and retrieval is improved by including semantically related terms and concepts [7-10]. The same approaches are used for summarizing Electronic Health Records [11,12] and for document clustering [13]. The prediction of disease-causing genes and disease prediction from similar genes [14,15] rely on the identification of similar diseases [16] or genes [17]. Other applications include drug re-purposing [18,19] and drug interaction [20].

The recent approaches that have given the best results in semantic similarity and relatedness measures are founded on word embedding vectors computed by neural networks. Indeed, such architectures, implemented initially by word2vec [21], have gained in popularity in the biomedical domain as they have proven able to capture semantic similarity and relatedness relationships between words and concepts efficiently [22-27]. Word embeddings are based on neural network language modeling, where words are mapped to fixed-dimension vectors of real numbers. The similarity between words can thus be measured by the (cosine) similarity between vectors that are constructed over a training corpus. All co-occurrences of a word and its neighbors (i.e. contexts) within a predefined window size are considered. The idea behind those representation learning approaches is that two words that have close meanings generally have similar contexts [28]. For example, the words “Epilepsy” and “Convulsion” will both have “Brain” and “Mind” as neighbors.
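As a minimal illustration of the vector comparison described above, the following Python sketch computes the cosine similarity between word vectors. The three-dimensional vectors are invented for the example and are not taken from any trained model; real embeddings in this study have 300 dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen in this sketch for all-zero vectors
    return dot / (norm_u * norm_v)

# Toy vectors (hypothetical values, for illustration only).
vectors = {
    "epilepsy":   [0.9, 0.1, 0.3],
    "convulsion": [0.8, 0.2, 0.4],
    "fracture":   [0.1, 0.9, 0.2],
}

sim_close = cosine_similarity(vectors["epilepsy"], vectors["convulsion"])
sim_far = cosine_similarity(vectors["epilepsy"], vectors["fracture"])
print(f"epilepsy vs convulsion: {sim_close:.3f}")
print(f"epilepsy vs fracture:   {sim_far:.3f}")
```

With these toy values, the two words that share contexts end up with a higher cosine than the unrelated pair, which is the behaviour the distributional hypothesis predicts.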
word2vec, developed by Mikolov et al. [21], is a neural network language model that learns word vectors either by maximizing the probability of a word given the surrounding context, referred to as the CBOW (Continuous Bag Of Words) approach, or by maximizing the probability of the context given a word, referred to as the skip-gram approach.

In this study we propose a new method, named “MeSH-gram”, which relies on a straightforward approach: it computes the word vectors by using only the MeSH (Medical Subject Headings) descriptors that are already included in the MEDLINE/PubMed corpus. The MeSH-gram model extends the skip-gram neural network model used in the word2vec [21] and fastText [29] tools. fastText is a successful reimplementation of word2vec, designed to compute the vector of each word using its neighbors. The extension we propose in the MeSH-gram model replaces the neighbors by the MeSH descriptors of the abstract in which each word occurs.

Related Works

Several semantic similarity and relatedness measures have been proposed over the last decades [27]. Many of them have been implemented in the UMLS::Similarity package [30] available in the UMLS (Unified Medical Language System). They differ in the method used: path-based, content-based, UMLS-based, corpus-based, and more recently, methods based on word vectors and concept vectors.

Path-based measures [7] use the hierarchical structure of a taxonomy to measure similarity: concepts close to each other are more similar. For instance, Sajadi et al. [31,32] developed a ranking algorithm based on Wikipedia graph metrics and used it to compare biomedical concepts. Content-based information measures [33,34] quantify the amount of information a concept provides: the more specific concepts have a greater amount of information content. Other approaches [35,36] use the entire UMLS Metathesaurus® [37] to compare the contexts in the definitions of the concepts and thereby quantify their relatedness.

Several methods are vector-based: the concepts are represented by vectors and the relatedness is usually estimated using the cosine similarity between them. In [36], the authors proposed to compute gloss vectors based on second-order co-occurrences trained on WordNet. In [38], the authors computed the cosine of two Latent Semantic Indexing concept vectors based on a Pointwise Mutual Information association measure matrix.

Recent vector-based methods use neural networks in order to compute concept vectors. The word2vec [21] tool was trained on different corpora: OHSUMED (by Sajadi et al. [32]), PubMed/MEDLINE (by Chiu et al. [23]), PubMed Central (by Muneeb et al. [22], Chiu et al. [23], and Pakhomov et al. [24]), and CLINICAL-ALL (by Pakhomov et al. [24]). Following the approach used by De Vine et al. [39] on OHSUMED, Yu et al. [25] trained word2vec on PubMed/MEDLINE transformed to UMLS concepts using the MetaMap indexing tool [40].

Other recent methods rely on word vectors. In their previous work, Yu et al. [41] retrofitted word vectors obtained by word2vec with hierarchical information from the MeSH thesaurus. Recently, Henry et al. [26] compared different ways to combine word vectors in order to compute multi-word term vectors. The compared multi-word term aggregation methods consist of the summation (averaging) of component word vectors, the creation of concept vectors using the MetaMap indexing tool [40], and the creation of multi-word term vectors using the compoundify tool based on the UMLS Specialist Lexicon as glossary [xx]. More recently, Henry et al. [27] used association measures for estimating semantic similarity and relatedness between biomedical concepts on PubMed/MEDLINE transformed to UMLS concepts.
The best performance results were obtained by [25-27]. Their respective approaches rely either on MetaMap, in order to transform the text corpus into UMLS concepts, or on additional external resources such as the Specialist Lexicon. The MeSH-gram model we propose in this study relies on a straightforward approach: it computes the word vectors by using only the MeSH descriptors that are already included in the PubMed/MEDLINE corpus. The extension we propose in the MeSH-gram model replaces the neighbors by the MeSH descriptors of the abstract in which each word occurs.

In order to evaluate MeSH-gram, we use publicly available manually annotated corpora: two subsets (MiniMayoSRS) of the MayoSRS (Mayo Semantic Relatedness Set) from the Mayo Clinic, developed by Pakhomov et al. [42], and two from UMNSRS (The University of Minnesota Semantic Relatedness Set), developed by Pakhomov et al. [43]. MeSH-gram results are first compared to skip-gram and then to twenty existing solutions reported in [27], including the best ones [25-27].

The MeSH-gram model has several advantages: (i) it avoids considering uninformative and overly frequent words; (ii) there are fewer MeSH descriptors than possible context words; and (iii) MeSH descriptors are manually assigned and curated, which ensures the best quality of indexing.

Methods

Neural network language models learn word vectors either by maximizing the probability of a word given the context, referred to as the CBOW (Continuous Bag Of Words) approach, or by maximizing the probability of the context given a word, referred to as the skip-gram approach.

Skip-gram Word Embedding Model

Given a text line $w_1 w_2 \dots w_n$ of words $w_i$, the skip-gram model maximizes the following average log probability:

$$ \frac{1}{2r} \sum_{-r \le j \le r,\; j \ne 0} \log p(w_{t+j} \mid w_t) $$

where $w_t$ is the target word, $w_{t+j}$ is the context, and $r$ is the context window radius. The context words surrounding the target term are determined by the context window radius $r$.
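To make the role of the window radius concrete, here is a small sketch that enumerates the (target, context) pairs a skip-gram model would see for one line of text. The function name `skipgram_pairs` and the sample sentence are illustrative assumptions; actual tools such as word2vec and fastText apply further refinements that are not shown here.

```python
def skipgram_pairs(words, r):
    """Yield (target, context) pairs for every position t and offset j
    with -r <= j <= r and j != 0, staying inside the line."""
    for t, target in enumerate(words):
        for j in range(-r, r + 1):
            if j == 0:
                continue
            c = t + j
            if 0 <= c < len(words):
                yield target, words[c]

# Hypothetical text line, for illustration only.
line = "seizures are a common symptom of epilepsy".split()
pairs = list(skipgram_pairs(line, r=2))

for target, context in pairs[:6]:
    print(target, "->", context)
print(len(pairs), "pairs in total")
```

Words near the ends of the line have fewer than 2r neighbours, so they contribute fewer pairs than words in the middle.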
The probability of a context word $w_c$ given a target word $w_t$ is computed by:

$$ p(w_c \mid w_t) = \frac{\exp\left(V_{w_c}^{\top} V_{w_t}\right)}{\sum_{w=1}^{N} \exp\left(V_{w}^{\top} V_{w_t}\right)} $$

where $N$ is the vocabulary size, and $V_w$ represents the vector of the word $w$.

MeSH-gram Word Embedding Model

The MeSH-gram word embedding model proposed in this paper extends the skip-gram neural network model used in the word2vec [21] and fastText [29] tools: it uses MeSH descriptors that are already included in the PubMed/MEDLINE corpus to compute the word vectors.

Given $w_1 w_2 \dots w_n$ the words of a PubMed/MEDLINE abstract, and $m_1 m_2 \dots m_k$ the MeSH descriptors associated with this abstract, the MeSH-gram model maximizes the following average log probability:

$$ \frac{1}{k} \sum_{i=1}^{k} \log p(m_i \mid w_t) $$

where $w_t$ is the target word and $m_i$ is a MeSH descriptor.
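The pairing implied by this objective can be sketched as follows: every word of an abstract is paired with every MeSH descriptor indexed for that abstract, regardless of the word's position. This is only an illustration of the idea, not the authors' modified fastText code; the function name, the sample abstract, and the descriptor assignment are hypothetical.

```python
def meshgram_pairs(words, descriptors):
    """Pair each word of an abstract with each of its MeSH descriptors."""
    return [(w, m) for w in words for m in descriptors]

# Hypothetical abstract and descriptor assignment, for illustration only.
abstract = "seizures are a common symptom of epilepsy".split()
mesh = ["Epilepsy", "Seizures", "Humans"]

pairs = meshgram_pairs(abstract, mesh)
print(pairs[:3])
# Each word contributes k = len(mesh) training pairs, whatever its position.
print(len(pairs), "pairs =", len(abstract), "words x", len(mesh), "descriptors")
```

Unlike a sliding window, the number of contexts per target word here depends only on how many descriptors were assigned to the abstract.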
The probability of a context MeSH descriptor $m_c$ given a target word $w_t$ is computed by:

$$ p(m_c \mid w_t) = \frac{\exp\left(V_{m_c}^{\top} V_{w_t}\right)}{\sum_{m=1}^{M} \exp\left(V_{m}^{\top} V_{w_t}\right)} $$

where $M$ is the number of MeSH descriptors, and $V_m$ represents the vector of the MeSH descriptor $m$. We have adapted fastText [29] in order to feed the neural network with pairs of word/MeSH descriptor.

Vector Representation and Similarity Computation

Using our MeSH-gram model, and the skip-gram model for comparison, we built word vectors of dimension 300. For the skip-gram model, we computed the vectors with window sizes W of 2, 5, 10 and 25. In order to quantify the relatedness of a pair of words, the cosine similarity between the distributional context vectors of the two words is used. In the case of a multi-word term, the vector is generated by computing the average of the vectors of the words that compose the term. As an example, for the term “epilepsy attack”, the vector $V_{epilepsy\_attack}$ is computed as $V_{epilepsy\_attack} = (V_{epilepsy} + V_{attack})/2$, where $V_{epilepsy}$ and $V_{attack}$ represent the vectors of the words “epilepsy” and “attack” respectively. Rather than combining word vectors after construction, multi-word term vectors may be constructed directly from a preprocessed training corpus in which multi-word terms have been identified [27], but this involves a huge cost in preprocessing and storage requirements.

Training Corpus

We used the PubMed/MEDLINE corpus, which contains the abstract of each article and the associated MeSH descriptors (URL: ftp://ftp.ncbi.nlm.nih.gov/pubmed/). The corpus was parsed with pubmed_parser, a Python XML parser for the PubMed dataset (URL: https://github.com/titipata/pubmed_parser). Each abstract was tokenized using polyglot (URL: https://github.com/aboSamoor/polyglot). For the skip-gram word embedding model using fastText, we prepared a file composed of concatenated abstracts, in which each abstract is split into one sentence per line. For the MeSH-gram word embedding model, we generated a file in which each line consists of an abstract with its MeSH descriptors. We have adapted fastText in order to read each line of this file and feed the neural network with pairs of (word, MeSH descriptor).

Gold Standard

In order to compare the MeSH-gram word embedding model proposed in this study with other methods, we used two evaluation benchmarks: MiniMayoSRS [42] and UMNSRS [43].

MiniMayoSRS consists of 29 clinical term pairs. Two thirds of the pairs (66.67%) contain a multi-word term. The relatedness of each term pair is rated by medical coders and also by physicians.

UMNSRS consists of 566 and 586 medical term pairs, for measuring similarity and relatedness respectively. The degree of association between terms in each dataset was rated by four medical residents from the University of Minnesota medical school. As suggested by Pakhomov et al. [43], we use a subset of the ratings consisting of 401 pairs for the similarity set and 430 pairs for the relatedness set. Twenty (4.99%) and seventeen (3.95%) of the term pairs contain multi-word terms for the similarity and relatedness subsets respectively.

All these clinical terms correspond to UMLS concepts included in the Metathesaurus®. The correlations between the generated relatedness scores and the human-assigned scores are calculated using Spearman’s rank correlation coefficient.

Results

Skip-gram Model versus MeSH-gram Model

The results of the experiments are given in Table 1, which compares the results obtained with the skip-gram model and those obtained with the MeSH-gram model using our modified version of fastText, according to the four gold standards: MiniMayoSRS rated by physicians (MiniMayoSRS phys.), MiniMayoSRS rated by medical coders (MiniMayoSRS cod.), UMNSRS for similarity (UMNSRS Sim.) and UMNSRS for relatedness (UMNSRS Rel.).

Table 1 – Spearman’s rank correlations between human scores and computed similarities.
W: window size; n: number of pairs.

                  MiniMayoSRS             UMNSRS
Model             Phys.       Cod.        Sim.        Rel.
                  (n=29)      (n=29)      (n=380)     (n=397)
Skip-gram W=02    0.740       0.757       0.679       0.529
Skip-gram W=05    0.763       0.779       0.704       0.576
Skip-gram W=10    0.776       0.789       0.716       0.589
Skip-gram W=25    0.766       0.781       0.718       0.608
MeSH-gram         0.811       0.855       0.724*      0.643**

* n=387; ** n=407
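The evaluation protocol behind figures such as those in Table 1 can be sketched in a few lines: build a vector for each term (averaging the word vectors of multi-word terms), score each pair by cosine similarity, and correlate the scores with the human ratings using Spearman's rank correlation. All vectors, term pairs, and ratings below are made-up toy values, and the helper names are assumptions of this sketch; pairs containing a word without a vector are simply skipped here.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def term_vector(term, vectors):
    """Average the vectors of the words in a term; None if a word is unknown."""
    words = term.lower().split()
    if not words or any(w not in vectors for w in words):
        return None
    dim = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0.0 or sy == 0.0:
        return float("nan")  # undefined when one side is constant
    return cov / (sx * sy)

# Toy data (hypothetical): word vectors and human-rated term pairs.
vectors = {
    "epilepsy": [0.9, 0.1, 0.3], "attack": [0.6, 0.3, 0.5],
    "convulsion": [0.8, 0.2, 0.4], "fracture": [0.1, 0.9, 0.2],
    "bone": [0.2, 0.8, 0.1], "aspirin": [0.3, 0.3, 0.9],
}
gold = [("epilepsy attack", "convulsion", 3.8), ("fracture", "bone", 3.0),
        ("aspirin", "convulsion", 1.5), ("epilepsy", "bone", 1.0),
        ("aspirin", "unknownword", 2.0)]

human, model = [], []
for a, b, rating in gold:
    va, vb = term_vector(a, vectors), term_vector(b, vectors)
    if va is None or vb is None:
        continue  # skip pairs with an out-of-vocabulary word
    human.append(rating)
    model.append(cosine(va, vb))

rho = spearman(human, model)
print(f"n={len(human)} pairs, Spearman rho={rho:.3f}")
```

Because Spearman's coefficient only compares rank orders, the absolute scale of the cosine values does not matter, only whether the model orders the pairs as the human raters did.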
MeSH-gram Model Compared to Previous Works

Table 2 gathers the results obtained by the MeSH-gram model we developed and the results of twenty previous works. It allows a comparison between all the models on the same gold standards (MiniMayoSRS and UMNSRS). Table 2 completes Table 12 given by Henry et al. [27].

Discussion

As can be seen in Table 1, for the skip-gram model, the larger the window, the better the results on the UMNSRS gold standard. For the MiniMayoSRS set, the best skip-gram results are obtained with a window size W=10. This suggests that word vectors are a better solution when a large number of context words in the abstract is considered. The best results are obtained with the MeSH-gram model, which considers MeSH descriptors as the context of each term, suggesting that MeSH descriptors capture the semantics of all the abstracts associated with them. We can conclude that taking MeSH descriptors instead of context words gives better results than considering a large window size: (i) a bigger window size does not necessarily lead to better results, and (ii) MeSH descriptors are fewer than context words (50 context words for a window size W=25), which also reduces computation time.

The deeper comparison with twenty methods, displayed in Table 2, confirms that the MeSH-gram model gives results comparable to the best previous methods on the four gold standard datasets. While methods (1) and (2) rely on the translation of PubMed/MEDLINE text data into UMLS concepts, and methods (4) and (5) require additional steps or resources such as the compoundify tool (4) and the MetaMapped MEDLINE corpus (5), the MeSH-gram model uses only the raw text corpus as input. The best previous results are obtained by method (2), followed by method (3).
However, method (2) is not recommended by the authors themselves, as it uses concept expansion, which requires additional computation without significantly increasing performance on any dataset [27]. MeSH-gram is comparable to method (3), with better results on three datasets. All these results allow us to conclude that the UMLS information used by methods (1) to (5) is already contained in the MeSH descriptors available in the PubMed/MEDLINE corpus and used by the MeSH-gram model. Using MeSH descriptors as context is a good solution for datasets founded on UMLS concepts. However, MeSH-gram should be evaluated on other types of similarities, such as Bio-SimVerb and Bio-SimLex [44].

Table 2 – Spearman’s rank correlations between human scores and computed similarities using MeSH-gram and previous works’ methods. n: number of pairs (inspired by [27]).

Method                                            MiniMayoSRS                 UMNSRS
                                                  Phys.         Cod.          Sim.           Rel.
MeSH-gram                                         0.81 (n=29)   0.86 (n=29)   0.72 (n=387)   0.64 (n=407)
(1) Henry et al. [27]; recommended                0.84 (n=29)   0.81 (n=29)   0.69 (n=392)   0.64 (n=418)
(2) Henry et al. [27]; not recommended            0.85 (n=29)   0.84 (n=29)   0.73 (n=392)   0.66 (n=418)
(3) Henry et al. [26]; CBOW words                 0.82 (n=29)   0.82 (n=29)   0.69 (n=374)   0.61 (n=396)
(4) Henry et al. [26]; CBOW compounds             0.80 (n=29)   0.78 (n=28)   0.70 (n=373)   0.65 (n=393)
(5) Henry et al. [26]; CBOW concepts              0.77 (n=29)   0.83 (n=29)   0.73 (n=388)   0.60 (n=413)
(6) Yu et al. [25]; narrow + other relations      --            --            0.69 (n=526)   0.62 (n=543)
                                                                              0.68 (n=418)   0.63 (n=427)
(7) Yu et al. [25]; no lexicons                   --            --            0.64 (n=526)   0.59 (n=543)
                                                                              0.63 (n=418)   0.59 (n=427)
(8) Yu et al. [41]                                0.70 (n=25)   0.67 (n=25)   --             --
(9) Sajadi et al. [32]; HITS similarity           0.67 (n=29)   0.72 (n=29)   0.58 (n=566)   0.51 (n=587)
(10) Sajadi et al. [32]; word2vec OHSUMED+UMLS    --            --            0.39 (n=566)   0.39 (n=587)
(11) Sajadi et al. [32]; word2vec on OHSUMED      --            --            0.26 (n=566)   0.29 (n=587)
(12) Chiu et al. [23]                             --            --            0.65 (n=n/a)   0.60 (n=n/a)
(13) Pakhomov et al. [24]                         --            --            0.62 (n=449)   0.58 (n=458)
(14) Muneeb et al. [22]                           --            --            0.52 (n=462)   0.45 (n=465)
(15) Workman et al. [38]                          0.67 (n=29)   --            --             --
                                                  0.69 (n=25)
(16) Patwardhan and Pedersen [36]                 0.59 (n=29)   0.58 (n=29)   0.58 (n=387)   0.45 (n=412)
(17) Lin [34]                                     0.42 (n=26)   0.53 (n=26)   0.49 (n=340)   0.29 (n=360)
(18) Resnik [33]                                  0.34 (n=26)   0.46 (n=26)   0.49 (n=340)   0.26 (n=360)
(19) Rada et al. [7]                              0.35 (n=26)   0.44 (n=26)   0.53 (n=340)   0.29 (n=360)
(20) Lesk [35]                                    0.52 (n=29)   0.57 (n=29)   0.50 (n=387)   0.33 (n=412)
Conclusions

In this paper, we proposed a new method, MeSH-gram, to create distributional word vectors using MeSH descriptors as word context. We evaluated our results on four standard evaluation datasets (MiniMayoSRS Physicians, MiniMayoSRS Coders, UMNSRS tagged for relatedness, and UMNSRS tagged for similarity) and compared them against the skip-gram model as a baseline and against previous methods. All the obtained Spearman’s rank correlations between human scores and computed similarities show that MeSH-gram (i) outperforms the skip-gram model, and (ii) is comparable to the best recent methods, which however require more computation and additional external resources.

In our future work, we plan to include in MeSH-gram the MeSH qualifiers affiliated to the descriptors in order to obtain a more precise semantic meaning (e.g. cancer/complications is more precise than cancer). A second step is to use fastText subwords and to evaluate MeSH-gram on other kinds of similarities such as Bio-SimVerb and Bio-SimLex. MeSH-gram may also be used in languages other than English, for instance on French bibliographic corpora such as CISMeF [45], as well as on annotated electronic health records.

References

[1] C. Lofi, Measuring semantic similarity and relatedness with distributional and knowledge-based approaches, Information and Media Technologies 10(3) (2015), 493–501.
[2] M. Yetisgen-Yildiz and W. Pratt, Using statistical and knowledge-based approaches for literature-based discovery, J Biomed Inform 39(6) (2006), 600–611.
[3] P. Agarwal and D.B. Searls, Can literature analysis identify innovation drivers in drug discovery? Nat Rev Drug Discov 8(11) (2009), 865–878.
[4] F. Doshi-Velez et al., Graph-sparse LDA: a Topic Model with Structured Sparsity, In Proc. 29th AAAI Conference on Artificial Intelligence (2015), 2575–2581.
[5] M.R. Nelson, H. Tipney, J.L. Painter, J. Shen, P. Nicoletti, Y. Shen, et al., The support of human genetic evidence for approved drug indications, Nat Genet 47(8) (2015), 856–860.
[6] B.T. McInnes and T. Pedersen, Evaluating semantic similarity and relatedness over the semantic grouping of clinical term pairs, J Biomed Inform 54 (2015), 329–336.
[7] R. Rada, H. Mili, E. Bicknell, and M. Blettner, Development and application of a metric on semantic nets, IEEE SMC 19(1) (1989), 17–30.
[8] T. Cohen and D. Widdows, Empirical distributional semantics: methods and biomedical applications, J Biomed Inform 42(2) (2009), 390–405.
[9] A. Henriksson, H. Moen, M. Skeppstedt, V. Daudaravičius, and M. Duneld, Synonym extraction and abbreviation expansion with ensembles of semantic spaces, J Biomed Semantics 5(1) (2014), 6.
[10] C. Kurtz, C.F. Beaulieu, S. Napel, and D.L. Rubin, A hierarchical knowledge-based approach for retrieving similar medical images described with semantic annotations, J Biomed Inform 49 (2014), 227–244.
[11] R. Pivovarov and N. Elhadad, Automated methods for the summarization of electronic health records, J Am Med Inform Assoc 22(5) (2015), 938–947.
[12] H. Moen, L.-M. Peltonen, J. Heimonen, A. Airola, T. Pahikkala, T. Salakoski, and S. Salanterä, Comparison of automatic summarisation methods for clinical free text notes, Artif Intell Med 67 (2016), 25–37.
[13] Y. Lin et al., A document clustering and ranking system for exploring MEDLINE citations, J Am Med Inform Assoc 14(5) (2007), 651–661.
[14] K. Lage et al., A human phenome-interactome network of protein complexes implicated in genetic disorders, Nat Biotechnol 25(3) (2007), 309.
[15] X. Wu, Q. Liu, and R. Jiang, Align human interactome with phenome to identify causative genes and networks underlying disease families, Bioinformatics 25(1) (2008), 98–104.
[16] A.-L. Barabási, N. Gulbahce, and J. Loscalzo, Network medicine: a network-based approach to human disease, Nat Rev Genet 12(1) (2011), 56.
[17] T.M. Beissinger and G. Morota, Medical Subject Heading (MeSH) annotations illuminate maize genetics and evolution, Plant Methods 13(1) (2017), 8.
[18] A. Gottlieb, G.Y. Stein, E. Ruppin, and R. Sharan, PREDICT: a method for inferring novel drug indications with application to personalized medicine, Mol Syst Biol 7(1) (2011), 496.
[19] A.S. Brown and C.J. Patel, MeSHDD: Literature-based drug-drug similarity for drug repositioning, J Am Med Inform Assoc 24(3) (2017), 614–618.
[20] S. Yoo, K. Noh, M. Shin, J. Park, K.-H. Lee, H. Nam, and D. Lee, In silico profiling of systemic effects of drugs to predict unexpected interactions, Sci Rep 8(1) (2018), 1612.
[21] T. Mikolov, I. Sutskever, K. Chen, G.S. Corrado, and J. Dean, Distributed Representations of Words and Phrases and their Compositionality, In Advances in Neural Information Processing Systems (2013), 3111–3119.
[22] T.H. Muneeb, S.K. Sahu, and A. Anand, Evaluating distributed word representations for capturing semantics of biomedical concepts, In ACL-International Joint Conference on Natural Language Processing (2015), 158.
[23] B. Chiu, G. Crichton, A. Korhonen, and S. Pyysalo, How to train good word embeddings for biomedical NLP, In Proc. of the 15th Workshop on BioNLP (2016), 166–174.
[24] S.V. Pakhomov, G. Finley, R. McEwan, Y. Wang, and G.B. Melton, Corpus domain effects on distributional semantic modeling of medical terms, Bioinformatics 32(23) (2016), 3635–3644.
[25] Z. Yu, B.C. Wallace, T. Johnson, and T. Cohen, Retrofitting concept vector representations of medical concepts to improve estimates of semantic similarity and relatedness, Stud Health Technol Inform 245 (2017), 657–661.
[26] S. Henry, C. Cuffy, and B.T. McInnes, Vector representations of multi-word terms for semantic relatedness, J Biomed Inform 77 (2018), 111–119.
[27] S. Henry, A. McQuilkin, and B.T. McInnes, Association measures for estimating semantic similarity and relatedness between biomedical concepts, Artif Intell Med (2018) (in press).
[28] Z.S. Harris, Distributional structure, Word 10(2–3) (1954), 146–162.
[29] P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, Enriching word vectors with subword information, Transactions of the Association for Computational Linguistics 5 (2017), 135–146.
[30] B.T. McInnes, T. Pedersen, and S.V. Pakhomov, UMLS-Interface and UMLS-Similarity: open source software for measuring paths and semantic similarity, In AMIA Annual Symposium Proceedings (2009), 431.
[31] A. Sajadi, Graph-based domain-specific semantic relatedness from Wikipedia, In Advances in Artificial Intelligence (2014), 381–386.
[32] A. Sajadi, E. Milios, V. Kešelj, and J.C. Janssen, Domain-specific semantic relatedness from Wikipedia structure: a case study in biomedical text, In International conference on intelligent text processing and computational linguistics (2015), 347–360.
[33] P. Resnik, Using information content to evaluate semantic similarity in a taxonomy, In Proc. of the 14th international joint conference on Artificial intelligence-Volume 1 (1995), 448–453.
[34] D. Lin, Using syntactic dependency as local context to resolve word sense ambiguity, In Proc. of the 8th conference on European chapter of the ACL (1997), 64–71.
[35] M. Lesk, Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone, In Proc. of the 5th annual international conference on Systems documentation (1986), 24–26.
[36] S. Patwardhan and T. Pedersen, Using WordNet-based context vectors to estimate the semantic relatedness of concepts, In Proc. of the Workshop on Making Sense of Sense: Bringing Psycholinguistics and Computational Linguistics Together (2006).
[37] O. Bodenreider, The unified medical language system (UMLS): integrating biomedical terminology, Nucleic acids research 32(suppl_1) (2004), D267–D270.
[38] T.E. Workman, G. Rosemblat, M. Fiszman, and T.C. Rindflesch, A literature-based assessment of concept pairs as a measure of semantic relatedness, In AMIA Annual Symposium Proceedings (2013), 1512.
[39] L. De Vine, G. Zuccon, B. Koopman, L. Sitbon, and P. Bruza, Medical semantic similarity with a neural language model, In Proc. of the 23rd ACM international conference on information and knowledge management (2014), 1819–1822.
[40] A.R. Aronson and F.M. Lang, An overview of MetaMap: historical perspective and recent advances, J Am Med Inform Assoc 17(3) (2010), 229–236.
[41] Z. Yu, T. Cohen, B. Wallace, E. Bernstam, and T. Johnson, Retrofitting word vectors of MeSH terms to improve semantic similarity measures, In Proc. of the 7th International Workshop on Health Text Mining and Information Analysis (2016), 43–51.
[42] S.V. Pakhomov, T. Pedersen, B. McInnes, G.B. Melton, A. Ruggieri, and C.G. Chute, Towards a framework for developing semantic relatedness reference standards, Journal of Biomedical Informatics 44(2) (2011), 251–265.
[43] S. Pakhomov, B. McInnes, T. Adam, Y. Liu, T. Pedersen, and G. Melton, Semantic similarity and relatedness between clinical terms: An experimental study, In Proc. of the American Medical Informatics Association Symposium (2010), 572–576.
[44] B. Chiu, S. Pyysalo, I. Vulić, and A. Korhonen, Bio-SimVerb and Bio-SimLex: wide-coverage evaluation sets of word similarity in biomedicine, BMC bioinformatics 19(1) (2018), 33.
[45] L.F. Soualmia and S.J. Darmoni, Combining different standards and different approaches for health information retrieval in a quality-controlled gateway, Int J Med Inform 74(2–4) (2005), 141–150.

Corresponding author: Lina F. Soualmia, PhD, HdR, lina.soualmia@litislab.eu, Normandie Université, CURIB (LITIS), 25 Rue Lucien Tesnière, 76130 Mont-Saint-Aignan, FRANCE, +33 232 955 173