
JOURNAL OF KUNMING METALLURGY COLLEGE ›› 2015, Vol. 31 ›› Issue (5): 65-69. DOI: 10.3969/j.issn.1009-0479.2015.05.012


The Application of NLTK Toolkit Based on Python in Corpus Research

LIU Xu   

  1. Department of Foreign Languages, Yunnan Normal University, Kunming 650500, China
  • Received: 2015-09-01 Online: 2015-11-30 Published: 2015-11-30

Abstract:

In current domestic corpus-based research, AntConc and PowerGREP are the main research tools, and few studies have used the Python NLTK package for data processing and analysis. Because of limitations in their design, these tools cannot support some research methods. This study uses the Python NLTK package so that the data follow a uniform standard, which avoids the trouble of converting data between different tools and word-processing formats. NLTK also makes up for the weaknesses of those tools in syntactic analysis, graphics, and regular expression search. This paper briefly introduces the application of the Python-based NLTK package in corpus research, then takes Jane Austen's novel Emma from the Gutenberg corpus as an example to explain how natural language processing can be used to process the data.
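The abstract does not reproduce the paper's own code, so the following is only a minimal sketch of the kind of processing it describes: a word-frequency count and a regular expression search over tokens. The function names (load_emma_words, top_words, regex_search) and the SAMPLE list are hypothetical. The sketch assumes NLTK's FreqDist and its Gutenberg corpus file austen-emma.txt, which needs a one-time nltk.download("gutenberg"), and it falls back to the standard library's Counter when NLTK is not installed.

```python
import re
from collections import Counter

try:
    # FreqDist is NLTK's frequency-distribution class (a Counter subclass).
    from nltk import FreqDist
except ImportError:
    # Assumption: fall back to Counter so the sketch runs without NLTK.
    FreqDist = Counter


def load_emma_words():
    """Return the tokens of Emma from NLTK's Gutenberg corpus.

    Requires NLTK plus a one-time nltk.download("gutenberg").
    """
    from nltk.corpus import gutenberg
    return list(gutenberg.words("austen-emma.txt"))


def top_words(tokens, n=5):
    """Return the n most frequent alphabetic tokens, lower-cased."""
    fdist = FreqDist(w.lower() for w in tokens if w.isalpha())
    return fdist.most_common(n)


def regex_search(tokens, pattern):
    """Return the sorted set of tokens that fully match a regular expression."""
    rx = re.compile(pattern)
    return sorted({w for w in tokens if rx.fullmatch(w)})


# Hypothetical stand-in data: the novel's opening words, pre-tokenized,
# so the helpers can be tried without downloading the corpus.
SAMPLE = ("Emma Woodhouse , handsome , clever , and rich , with a "
          "comfortable home and happy disposition").split()

if __name__ == "__main__":
    print(top_words(SAMPLE, 3))
    print(regex_search(SAMPLE, r"h\w+"))
```

With the corpus installed, passing load_emma_words() in place of SAMPLE runs the same two steps over the whole novel; a single token list serves every step, which is the uniform data standard the abstract refers to.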

Key words: Python, NLTK toolkit, corpus research
