Realizing breakthroughs in the areas of AI and language processing
Data & Science Solutions Group
Yahoo! JAPAN Research
Language Processing & Machine Learning
Joined in 2013—Black Belt*
He joined Yahoo Japan Corporation with the desire to carry out research related to language processing and machine learning using the vast amount of data the company possesses for analysis. He is currently performing research with a focus on developing search function improvements and document summarization functions for a variety of services.
Researching language processing at Yahoo! JAPAN Research
At my previous job at an electronics manufacturer, I was already researching analysis methods for product review texts, but I joined Yahoo Japan Corporation because I wanted to use Yahoo! JAPAN's vast amounts of data in my research. Since then, I have been studying the core technologies of language processing and machine learning, and their practical application in services, while working at Yahoo! JAPAN Research. The environment here gives me the freedom to carry out research in areas I am interested in, and we are all encouraged to write and present research papers. My mission is to tackle leading-edge fields, and I feel a great sense of satisfaction that my research has been recognized when a paper of mine is selected for a leading conference, such as the ACL*, and I present it there.
In addition, at times I can gain valuable experience from being involved in the actual development process. For example, I had a hand in a development project that equipped a conversational voice assistant app with a word-chain function based on a so-called gamification idea. By participating in this project, I was able to understand for myself that adding a word-chain module to regular apps increases user response and retention rates.
Driving language processing research with Yahoo! JAPAN's data, one of the largest collections in Japan
My main research theme is language processing. I am currently doing cutting-edge research on whether search function improvements, ad click-through rate prediction, and document summarization functions, built by mapping language data onto real coordinate spaces and then processing it statistically, can be put to use in Yahoo! JAPAN's various services. I am reminded that an environment with a large number of tasks and a large amount of data available internally for research is truly a blessing for a researcher. At my previous job, there were hurdles to overcome before research and development could even begin: I had to start with data collection, or I could not find a place to apply the technology I had developed. Here, I am able to focus on research and development. Because Yahoo! JAPAN has one of the largest collections of data in Japan, we can often get good performance even with simple models. My work is fun because of the satisfaction I feel when my research yields good results, though a big part of that is due to the power of the data.
One thing I find interesting in my work is that although the text data used in language processing is made by humans, rules emerge from that data when it is gathered together in large amounts. Even though the data is man-made, it is consistent with rules found in nature.
Utilizing Yahoo! JAPAN’s vast computing resources to make breakthroughs in language processing
AI will be an extremely important theme in the future. However, the field of language processing has not yet seen breakthroughs as big as those in image and speech processing. First, I want to take advantage of my position at Yahoo! JAPAN Research and achieve breakthroughs in specific tasks for services by utilizing the enormous amount of data and computing resources that Yahoo! JAPAN has. Deep learning has been making progress, so we have recently been applying its technology to our research in order to realize a service that automatically generates natural text. In the future, I would like to continue presenting my research, thereby increasing my presence in this field and playing a part in driving its development.
I believe that in the future we will be able to come closer to creating a more universal intelligence by not limiting ourselves to the world of language, and by also integrating and applying lifelog-type information such as images and sound.
*Black belt: A system whereby the company supports the activities, both internal and external, of employees who exhibit outstanding talent in a certain area, designating them as black belts.
*ACL: Annual Meeting of the Association for Computational Linguistics, one of the leading international language processing conferences.
※ Information current as of April 2016.