K.F. Wong obtained his PhD from Edinburgh University, Scotland, in 1987. After his PhD, he was a researcher at Heriot-Watt University (Scotland), UniSys (Scotland) and ECRC (Germany). At present he is Associate Dean (External Affairs) of the Faculty of Engineering, Professor in the Department of Systems Engineering and Engineering Management, Director of the Centre for Innovation and Technology (CINTEC), and Associate Director of the Centre for Entrepreneurship, The Chinese University of Hong Kong (CUHK). He is also Honorary Professor, Harbin Institute of Technology (Shenzhen Graduate School); Adjunct Professor, School of Computer Technology, Peking University; and Adjunct Professor, Northeastern University, Shenyang, China. He is a fellow of HKITJC, HKIE, IET and BCS. K.F. is very active in professional activities in Hong Kong. He is the Founding Chairman of the Bridging Digital Divide in China (HK) Charity Foundation. He also serves as Chairman, ICT Advisory Committee, HK Scout Association; Chairman, Organisation Committee, 2012 & 2013 HK ICT Awards; Member of the Consumer Council; and Non-official Member, Digital 21 Strategy Advisory Committee, HKSAR. He was awarded the Medal of Honour (MH) by the HKSAR Government in 2011 for his contributions to IT development in Hong Kong.
Talk Title: NLP for Microblog Summarization
Abstract: The massive volume of textual information on the Internet, e.g. news, has led to the problem of information explosion. Reading such massive text directly is impractical, if not incomprehensible. For this reason, there is a growing demand for automatic summarization technology in the digital era. Traditionally, the goal of automatic summarization is to identify relevant excerpts from the target document(s), digest them and represent them in a succinct form for easy reading. While research in summarization has made good progress over the past decades, its effectiveness has proved extremely low for social media applications, such as microblogging on Twitter, WeChat, etc. This is mainly due to the length limit on microblog messages, which in turn leads to a lack of proper grammatical structure and discourse information. We investigate how conventional NLP techniques can best be applied to summarizing microblogs. By analogy with a document in traditional natural language text, we first cluster related microblogs to form a microblog document (m-doc). Within an m-doc, microblog paragraphs (m-paras) are formed by the leading topics. In practice, an m-para is a microblog re-post tree. Each m-para comprises inter-related short microblog messages ('sentences'), each referred to as an m-sent. Links between m-sents and m-paras are viewed as contextual information. In our research, we propose a new discourse integration technique over m-docs for microblog summarization.
Hang Li is director of the Noah’s Ark Lab of Huawei Technologies and an adjunct professor at Peking University and Nanjing University. He is an ACM Distinguished Scientist. His research areas include information retrieval, natural language processing, statistical machine learning, and data mining. Hang graduated from Kyoto University in 1988 and earned his PhD from the University of Tokyo in 1998. He worked at the NEC lab as a researcher from 1991 to 2001, and at Microsoft Research Asia as a senior researcher and research manager from 2001 to 2012. He joined Huawei Technologies in 2012. Hang has published three technical books and more than 120 technical papers at top international conferences including SIGIR, WWW, WSDM, ACL, EMNLP, ICML, NIPS, SIGKDD, AAAI, and IJCAI, and in top international journals including CL, NLE, JMLR, TOIS, IRJ, IPM, TKDE, TWEB, and TIST. Papers by him and his colleagues received the SIGKDD’08 best application paper award, the SIGIR’08 best student paper award, and the ACL’12 best student paper award. Hang worked on the development of several products, such as Microsoft SQL Server 2005, Office 2007, Live Search 2008, Bing 2009, Office 2010, Bing 2010, Office 2012, and Huawei Smartphones 2014. He has 42 granted US patents. Hang is also very active in the research community and has served or is serving top international conferences as PC chair, senior PC member, or PC member, including SIGIR, WWW, WSDM, ACL, NAACL, EMNLP, NIPS, SIGKDD, ICDM, IJCAI, and ACML, and top international journals as associate editor or editorial board member, including CL, IRJ, TIST, JASIST, and JCST.
Talk Title: Will Question Answering Become the Main Theme of IR Research?
Abstract: In the era of the mobile internet, question answering (QA) is becoming a more popular way for users to access information. For example, users want to get information more naturally and quickly through question answering on mobile phones, and users in a specific domain want to get more precise and helpful advice for their decision making through question answering over a knowledge base in that domain. Traditionally, a QA system is built upon search technologies. I argue in this talk that this is not sufficient to meet users' QA needs, now or in the future. I will discuss several important yet unsolved QA problems for research in the field of IR, including Generative QA, Robust QA, and Interactive QA. I will also introduce some of our work on QA conducted at Huawei's Noah's Ark Lab.
Emine Yilmaz is an associate professor at University College London, Department of Computer Science, where she is a faculty fellow of the Alan Turing Institute. She also works as a research consultant for Microsoft Research Cambridge. She is the recipient of the 2015 Karen Sparck Jones Award for the contributions of her research to the field of information retrieval, and she received the Google Faculty Research Award in 2014/2015. Yilmaz's current research interests include information retrieval, data mining, and applications of information theory, statistics and machine learning. She has published extensively at major venues such as ACM SIGIR, CIKM and WSDM, given several tutorials at top conferences, and organized various workshops. Her research has been widely adopted by the information retrieval community, and the sampling methods she has designed for efficient retrieval evaluation have been commonly used and made publicly available by the Text Retrieval Conference (TREC), funded by the National Institute of Standards and Technology (NIST). She is currently serving as the PC Chair for ACM SIGIR 2018 and ACM ICTIR 2017, Practice and Experience Chair for ACM WSDM 2017, and Doctoral Consortium Chair for ECIR 2017. She is an elected member of the executive committee of ACM SIGIR and an organizing committee member of the British Computer Society Information Retrieval Specialist Group.
Talk Title: New Ways of Thinking about Search with New Devices
Abstract: With the introduction of new types of devices into our everyday lives (e.g. smart phones, smart watches, smart glasses), the interfaces through which IR systems are used are becoming increasingly small, which limits the interactions users may have. Searching on devices with such small interfaces is not easy, as it takes more effort to type and interact with such systems. Hence, building IR systems that reduce the interactions needed with the device is critical. The design, optimization and evaluation of retrieval systems have traditionally focused on identifying and retrieving documents relevant to a query submitted by the user. However, on the new devices on which search engines are used, the effort needed to find relevant information plays a significant role in user satisfaction. In this talk, I will first argue that the effort needed to find relevant information in a document can have a significant impact on user satisfaction, and that more research should be put into devising retrieval methods that aim at minimizing user effort for a given query. Ideally, a search engine should be able to understand the reason that caused the user to submit a query, and it should help the user achieve the actual task by guiding her through the steps (or subtasks) that need to be completed. Devising such task-based information retrieval systems poses several challenges that have to be tackled. In the second part of this talk, I will focus on the problems that need to be solved when designing such systems, as well as the progress that we have made in these areas.