School of Computer Science and Technology Young Doctors Forum Lecture Series, No. 19
Published: 2019-01-04

Time: Monday, January 7, 2019, 15:00

Venue: 6th-floor conference room, School of Computer Science and Technology

 

Title: Linguistic Information: Concepts, Structures, and Quantitative Assessments

-- A Linguistic and Cognitive Approach to Natural Language Processing and Machine Learning

Abstract: In this talk, I will discuss a linguistic and cognitive approach to solving certain fundamental issues in the fields of natural language processing and machine learning with textual data. Specifically, in contrast to the currently prevailing methods that are mainly based on statistical approaches or mathematical operations on data as symbols, I will address the informational nature of textual contents as data that carry meaning and information.

    NLP as a subfield of Artificial Intelligence has had a long history with many attempts to make significant progress. However, the fundamental issues in the field have proven to be much more complicated than people originally expected. Over the years, mainstream NLP approaches gradually settled on using statistical methods for text classification and identifying information in “unstructured data”, after hand-written heuristics and rule-based systems failed to scale up to be generally applicable. But the advancements have not been significant due to the extremely challenging tasks in understanding natural languages.

    In this talk, I will present the major parts of a theoretical framework and implementation methods that I proposed for representing the relationships between language, knowledge, and information. The topics covered will include the concepts, structures, and quantitative measurements of what I call “linguistic information”. I will compare this framework with the modern information theory based on Claude Shannon’s ground-breaking work, and further extend Shannon’s basic concepts to natural language data and the process of linguistic communications.

   Demos will also be provided to show that more intelligent technologies and products can be built using the linguistic and cognitive approaches, in addition to the statistical approaches.

Speaker bio: Dr. Ronald (Guangsheng) Zhang conducted his first graduate study at Shanghai Jiao Tong University with a major in Applied Linguistics and Foreign Languages for Science and Technology. After that he was retained as an assistant professor at the same university. After coming to the United States, he earned his PhD in Linguistics from the University of Delaware. During these times, he developed deep insights into how natural languages encode and carry information, and how human brains comprehend language, together with a keen interest in building natural language-based technologies and products for solving practical problems and for verifying theoretical hypotheses.

    In recent years, Dr. Zhang worked as the CTO and Chief Scientist at Linfo Research in Silicon Valley, California, USA, and as the Chief Scientist at Bello Intelligent Technologies Co. in Shenzhen, China. Dr. Zhang is the inventor of 32 issued patents covering new technologies that include topic modeling, intelligent search engines, information extraction from unstructured data, rule-based methods for high-accuracy sentiment analysis, unsupervised machine-learning methods for knowledge discovery and knowledge representation, and information management methods and user interface functionalities for large amounts of unstructured data. Some of these patents are also being turned into academic papers.

Author: School of Computer Science and Technology   Reviewed by: Liu Xuejun (刘学军)