Constructing an Automatic Recognition Model for English Human Body Metaphors

Published: 2018-03-07 13:20

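Topic: conceptual metaphor theory  Focus: human body metaphor  Source: Beijing Foreign Studies University, doctoral dissertation, 2016  Document type: degree thesis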

[Abstract]: Research on automatic metaphor recognition began in the 1980s, and its methods have shifted from early rule-based approaches built on knowledge bases to statistical machine learning. Although recognition accuracy has improved, most of this work comes from computational linguistics, where scholars concentrate on optimizing statistical algorithms and pay little attention to the linguistic features of metaphor, so there is still considerable room for improvement. This study takes English human body metaphors as its object and builds an automatic recognition model on the basis of an analysis of their linguistic features, with the aim of exploring ways to optimize automatic metaphor recognition. Following conceptual metaphor theory, the study classifies the types of human body metaphor in detail and draws on the linguistic-level findings of existing research on human body metaphors, so that linguistic insight guides the construction of the model. The study addresses two research questions: 1) What are the basic features of the semantic distribution and linguistic forms of expression of English human body metaphors? 2) How can these linguistic features be applied to build an automatic recognition model for human body metaphors, and how well does the model perform? The study proceeds in three stages. (1) Corpus collection and manual metaphor annotation: 49 representative words from the human body domain were selected from the WordNet knowledge base, 3,000 sentences containing body words were randomly sampled from the BNC corpus, and the basic types of human body metaphor occurring in them were classified and manually annotated. (2) Linguistic feature analysis of the annotated corpus: the semantic distribution of the body words was analyzed, and the main linguistic forms of expression corresponding to each type of human body metaphor were summarized. (3) Construction and validation of the recognition model: to handle the different linguistic forms of human body metaphor, the model comprises two modules, a knowledge base of linguistic features of human body metaphors and a machine learning module, and its performance was analyzed on validation-set data. The main findings are as follows. (1) Conventional metaphors are the most common type of human body metaphor, occurring mainly in fixed and semi-fixed structures. (2) The knowledge base of linguistic features significantly improves the recall of machine learning and thereby the performance of the recognition model. (3) The recognition model based on the linguistic feature knowledge base and machine learning performs well, with a precision of 0.984, a recall of 0.755, and an F-value of 0.854. Compared with models built by earlier researchers, the model constructed in this study shows some improvement in effectiveness and offers considerably greater assurance of stability and applicability.
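The three reported scores can be checked against each other. The abstract does not say which F-measure was used; the minimal sketch below assumes the balanced F1 (the harmonic mean of precision and recall), and the function name `f_measure` is illustrative rather than taken from the thesis.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1.0 gives the balanced F1 (an assumption here)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both scores are zero: define the F-measure as zero instead of dividing by zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Figures reported in the abstract.
reported_precision = 0.984
reported_recall = 0.755

f1 = f_measure(reported_precision, reported_recall)
print(round(f1, 3))  # 0.854
```

Under the F1 assumption the result rounds to 0.854, which matches the reported F-value. The gap between precision and recall shows that the model rarely flags a literal use as metaphorical but still misses roughly a quarter of the annotated metaphors.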
[Degree-granting institution]: Beijing Foreign Studies University
[Degree level]: Doctoral
[Year degree granted]: 2016
[Classification number]: H315

[References]