MALE SPEAKER: Welcome everyone.
We're very pleased today to have professors and students
from Kyoto University in Japan here, visiting Google.
We're going to have Professor Katsumi Tanaka give us an
overview of the research at the Department of Social Informatics,
Graduate School of Informatics, and then we will have
Professor Toru Ishida give us an overview of the research at
the Laboratory for Global Information Network in the
Department of Social Informatics.
Professor Tanaka.
KATSUMI TANAKA: My name is Katsumi Tanaka,
from Kyoto University.
Thank you very much for kindly accepting
our visit to Google.
Today I'd like to briefly introduce the research
areas of my lab.
And first of all, this is the list of students and
professors and also researchers from Japan.
I came from Kyoto University in Japan, and Professor
Ishida also came from Kyoto University, and we have some
researchers from a Japanese government research
laboratory, NICT.
And also there are many graduate students at Kyoto
University here.
So I hope somebody in Google is watching me.
And my topic is Towards Next-Generation Search Engines
and Browsers.
So this is my image of a next-generation search engine.
[UNINTELLIGIBLE], the next image of a web search engine
will be search beyond media types and places.
You can see that this horizontal axis means
the place of content storage,
and the vertical axis is the
media type of the content.
So now we have text search engines, and also video search
engines for video searches, and Google is already exploring
several types of content search, so we are now
going up this axis: not only text data, but also image retrieval and
video retrieval.
So this is one direction of conventional web search.
And then there is the horizontal axis.
This is very interesting, I mean.
Currently, web search engines cover, of course, web content
and also desktop content, and conventional web
search engines are now moving slightly from left to right,
because some search engines are now able to retrieve not only
personal content on the desktop, but also external
databases, encyclopedias, or GIS content.
Now we are interested in this area.
I mean, recently televisions
have hard disk or DVD recorders.
They can store a lot of content, right?
So maybe sometime we will need a search engine to
search the content stored in hard disk or DVD recorders.
And even a search engine for the video content of your digital
camera, right?
So this is another axis.
So then, in order to imagine the next-generation web
search, maybe we will need integrated search technology.
And [INAUDIBLE PHRASE].
Very fortunately, web content has hyperlinks, and so Google
was successful with a very new ranking algorithm, called
PageRank.
But unfortunately, the program content stored on a
hard disk recorder
doesn't have any hyperlinks.
So then we need some new non-link-based ranking mechanism.
And so now I will briefly sketch an overview of our
research activities in the area of search engines, or
search technologies.
One direction is integrated search.
This slide shows a TV program.
While watching the TV program, the system can automatically retrieve
the related web pages in real time.
And I hear that in Google's research lab, maybe some
researchers are also engaged in this kind of research.
So this is one example of integrated search:
while you watch a TV program, the related
web pages are retrieved automatically at the same time.
This is another example of integrated search:
a combination of GIS content and web content.
This system automatically detects
landmark places from web content,
based on data mining technology.
A landmark place is a famous place in a town or
area.
Furthermore, based on those landmark places, the
system automatically retrieves web pages concerned with those
landmark places.
So this is another example: a combination of GIS content
and web content.
The third one is about image search, and I hope some
people at Google listen to this story.
This is an example of Google image search.
The precision ratio of Google image
search is now pretty good.
But the recall ratio of Google image search is not so
good.
Here is one example.
You can try it later.
Using the Google image search engine, why not input just
three keywords: Mount Fuji, and sunset, and snow?
The answer of Google image search is zero.
I mean, the number of hits is zero.
So can you believe that there are no images concerned
with Mount Fuji, and sunset, and snow in the whole World Wide Web
information space?
The answer is no.
There are many, many images concerned with Mount Fuji, and
sunset, and snow.
So then, how do we improve the recall ratio of the image
search engine?
The idea is basically simple.
Our idea is this:
we relax the query keywords.
We select only Mount Fuji and sunset,
and input just these two keywords into
Google image search.
The third keyword, snow, is input into ordinary text search,
I mean Google web search.
Then we take the intersection of these two result sets, and you
can find many, many relevant images concerned with Mount
Fuji, sunset, and snow.
So you can discover many, many relevant
images for all three keywords.
This is our idea to improve the Google image search
engine, especially to improve the recall ratio.
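The relaxation-and-intersection idea described above could be prototyped roughly as follows. The result sets, the `page_url` field, and the helper name are hypothetical placeholders standing in for real search engine responses, not an actual Google API.

```python
def intersect_results(image_hits, text_hits):
    """Keep image hits whose host page also appears in the full-keyword text search."""
    text_urls = {hit["page_url"] for hit in text_hits}
    return [hit for hit in image_hits if hit["page_url"] in text_urls]

# Hypothetical results for the relaxed image query "Mount Fuji sunset".
image_hits = [
    {"image": "fuji_snow.jpg", "page_url": "http://example.com/fuji-winter"},
    {"image": "fuji_summer.jpg", "page_url": "http://example.com/fuji-summer"},
]
# Hypothetical results for the full text query "Mount Fuji sunset snow".
text_hits = [{"page_url": "http://example.com/fuji-winter"}]

print(intersect_results(image_hits, text_hits))  # only the snow-related image survives
```

The relaxed image query improves recall, and intersecting with the full-keyword text search restores precision for the dropped keyword.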
And this is another example of integrated search.
We have already developed a search engine to retrieve not
only web content, but also TV program contents.
And this is the one example.
Now you can see a video image of our search
engine, which tries to retrieve not only web pages,
but also TV program content for your query.
Sorry, this is in Japanese, but the user has input the
keywords space and space shuttle.
And this is the answer.
You can see, this is a web page.
This is a web page.
But here, this is TV program content stored in your hard
disk recorder.
So once you input your keyword query, you can
retrieve web pages.
Furthermore, you can retrieve TV content.
And also you can browse the returned answers.
Now the user is focusing on this answer.
This is the TV program content, and if you zoom in, you
can see the details of the TV program content.
Also, very recently, young people in Japan, while they
are watching TV, are also using a PC and the internet.
Especially, they are communicating through online chat
systems. So this is another example of integrated search,
which tries to integrate not only TV program content, but
also online chat information.
This is a very brief introductory
video of this system.
This is the usual TV program,
and this is the user interface.
If you zoom out, you can see not only the
TV program content, but also, here, the closed-caption data
and, furthermore, here, the online chat information
concerned with this portion of the video program.
So this is another example of integrated search: TV
program content and online chat information.
This is also another example of integrated search.
We have already developed a browser which can browse
multiple websites concurrently.
Suppose that you are reading newspapers.
In this system, you can read multiple
web news sites concurrently.
And if you pick one of your favorite news articles, the
related news article from the other news
site is ready to read.
So this window is vertically synchronized with
this window.
The image is that, on your desktop, you can put two
newspapers side by side and compare the related news articles.
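One simple way to pair related articles across two news sites, as the synchronized browser does, is a keyword-overlap heuristic. The keyword extraction below is a toy stand-in for whatever matching the actual system uses, and the article texts are invented.

```python
def keywords(article):
    """Toy keyword extraction: lowercase words longer than four characters."""
    return {w.lower().strip(".,") for w in article.split() if len(w) > 4}

def related(article, other_site_articles):
    """Pick the article from the other site with the largest keyword overlap."""
    return max(other_site_articles, key=lambda a: len(keywords(article) & keywords(a)))

site_a_story = "Google announces new search ranking algorithm"
site_b_stories = [
    "Weather forecast sunny skies tomorrow",
    "Search ranking algorithm updated by Google",
]
print(related(site_a_story, site_b_stories))
```

A real system would also need to keep the two views scrolled in sync, but the pairing step is the core of "related article from the other site is ready to read."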
OK, and this is about the ranking algorithm.
Currently, the PageRank algorithm is a very
new and very good algorithm to rank each web page.
Our idea is slightly different.
Why rank web pages page by page?
Our idea is to rank a collection of pages.
I mean, we make page pairs from the answer
pages and use them as the unit for ranking.
So the unit of ranking is not each single
page, but a pair of pages.
Here is an example.
Suppose that you are querying UC Berkeley and Stanford.
This probably means you want to compare UC Berkeley and
Stanford, so you just input the keywords UC
Berkeley and Stanford.
The Google search engine will return these pages as
the candidate answers.
Some page has a lot of description about
UC Berkeley, but very few descriptions of Stanford.
And here, this page describes
UC Berkeley very much,
but Stanford very little.
This page describes Stanford very much, and UC Berkeley
very little.
But the intention of this query is probably that the user wishes
to compare UC Berkeley and Stanford.
So we want to make a pair of these pages.
That is our idea: making a pair of pages from this one and
this one, right?
And this pair of pages may be much more relevant to your
query than each single page.
That is the idea of ranking page
collections, not page by page.
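The pair-ranking idea could be sketched like this. The `coverage` function is a deliberately crude term-count stand-in for a real relevance score, and the sample pages are invented.

```python
from itertools import combinations

def coverage(page, term):
    """Toy relevance: how often a page mentions a term."""
    return page["text"].lower().count(term.lower())

def rank_pairs(pages, terms):
    """Score each pair of pages by how well the pair, together, covers every term.
    A pair is only as good as its worst-covered term (the min over terms)."""
    scored = [
        (min(max(coverage(a, t), coverage(b, t)) for t in terms), a["url"], b["url"])
        for a, b in combinations(pages, 2)
    ]
    return sorted(scored, reverse=True)

pages = [
    {"url": "berkeley.html", "text": "UC Berkeley Berkeley Berkeley campus, Stanford mentioned once"},
    {"url": "stanford.html", "text": "Stanford Stanford Stanford research"},
    {"url": "mixed.html", "text": "Berkeley and Stanford compared"},
]
print(rank_pairs(pages, ["Berkeley", "Stanford"]))
```

With the min-over-terms score, the two specialist pages are ranked as the best pair, which matches the comparison intent behind the query even though neither page alone covers both universities well.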
So I have almost finished
this quick review of our research activity.
We also explored some new types of browsers, especially
for browsing multimedia content.
Usually, every day we are using Internet Explorer,
that is, a web browser.
And also, every day we watch TV.
That is not unique at all.
But our idea is the reverse.
Why not watch the web?
Why not browse TV? In order to do that,
some kind of media conversion
technology is needed.
I mean, transform a web page into a TV program, or
transform TV program content into web-like pages.
Watching or listening is
very passive [UNINTELLIGIBLE],
and browsing is very active.
In Internet Explorer, every day we click, we
scroll up and down, and we read the text of web pages.
But sometimes we are tired, right?
Maybe sometimes we wish to just watch, like a video.
The conventional Internet Explorer interface is a
very active interface.
So sometimes we may wish to have a very passive
[UNINTELLIGIBLE]
the internet.
Here is an example.
OK, this example has no voice, but it is an
example of transforming a news article web page into
TV-program-like content.
These funny characters are speaking to each other.
The topic comes from a news article on some website.
So users can just watch or listen, as if it were
[? pre-recorded. ?]
This is one example.
If somebody can understand Japanese, this is
a very funny dialogue.
But anyway, this is in Japanese,
with no voice.
And maybe this technology can be
provided on PDAs.
Because mobile phones and PDAs both have small screens,
active browsing is not so easy in that case.
I mean, [UNINTELLIGIBLE]
scrolling up and down.
So a much more passive mode of reading web pages
may be necessary, and this kind of
technology can be used for that.
And this is the reverse direction:
converting TV program content into web-like pages.
[INAUDIBLE PHRASE].
This is a very short demo, and this program is
running on my PC now.
This is a very old news program from TV. You know who he is.
So this is a news program.
But now, suppose you are tired of it.
You don't want to see this news,
and you want to search for other news stored
on the hard disk recorder.
But your hard disk recorder is huge.
It may store TV programs for one year,
not just one week.
So I want to quickly search the TV news
stored in my hard disk recorder.
Our interface is this.
The interface automatically tries to convert
the TV program into web-like pages, just by zooming out.
It can zoom out.
Then this TV program is automatically transformed into
something like a web page.
This takes a [INAUDIBLE].
It takes the text from the closed captions of the TV program.
And furthermore, some
hyperlinks are automatically generated.
So you can quickly browse the whole content of your hard
disk recorder,
and you can select your favorite news article.
And then again, you can [INAUDIBLE PHRASE].
Finally, we are now interested in the trust of web search.
We have just started this research.
The problem is: to what extent can we believe the results of
a web search engine?
Even for the Google image search engine, or the Google
search engine, where a very good ranking
algorithm exists,
to what extent can we believe that the top-ranked page
is the best or not?
That is the problem of trust in web search.
In order to consider the trust of a web search engine,
maybe we should consider three items. One is concerned with
the content itself.
I mean, the searched page: does the page offer
fair information?
In order to analyze this, of course, we can use several
data mining technologies.
The second axis is social acceptance.
I mean, given some web page in a search result,
how do people evaluate that web page?
This is not so new.
I mean, the Google PageRank algorithm
is one way to represent the degree of social
acceptance of web pages.
But maybe we can explore other technologies in
order to consider the trust of search results.
The third axis is the reliability of the author.
This means that if you are given some web page that is
top-ranked for your query, you cannot know who authored it,
or how authors collaborated to create the web page.
So maybe some technology is needed to decide how far we can
trust the author of a web page.
This is very important.
I have no more time.
But today I have some materials, I mean my papers,
with much more detail.
If you are interested, please come and see me.
So anyway.
This is my conclusion.
In order to imagine the next-generation search engine, I
pointed out several issues, especially search
beyond media types and places.
This is very important, I think. The
browsing style may also be different, or it may come
in much more variety.
And furthermore, the trust of search will
be much more important.
Then, in order to realize this kind of image, I also pointed out
some basic technologies we are now exploring.
OK, so thank you very much for coming here.
Thank you.
MALE VOICE: I'd now like to introduce Professor Toru
Ishida from the Department of Social Informatics of Kyoto
University, who's going to talk to us about the Language Grid,
an infrastructure for intercultural collaboration.
Professor.
TORU ISHIDA: Good afternoon.
I'm Toru Ishida from Kyoto University.
I'd like to talk about the Language Grid, an
infrastructure for intercultural collaboration.
This is not yet a result; it is a plan.
So let me start with my motivation.
We have run a so-called intercultural collaboration
experiment since 2002, and I want to talk about why I want
to start the Language Grid project.
Then I will talk about the Language Grid architecture,
including the language service ontology and
language web services.
We want to work with various NPOs in Japan,
including hospital support for foreign patients, universal
playgrounds for kids around the world, and making a
radio program for disaster management.
So we will work on this project with three NPOs.
OK, here is the motivation.
The question is: do we really share information on the web?
It seems to us there is no standard language on
the internet now.
This is an online language population survey from September
2004. From this survey, it seems the English-speaking
population is 35%,
European languages are more than 29%,
and Asian languages are more than 26%.
So we would have to learn a lot of languages to understand all
the information.
But because that is impossible, we want to try to use
machine translation.
And if we use machine translation,
we get results like this:
where a human translator would translate a phrase as "don't
worry, it's nothing," a machine translator may say "not
caring, trivial problems." So the question is: what happens
when we use machine translation in intercultural
collaboration?
We created that experiment in 2002 and are still
continuing it in 2005.
The experiment in 2002 was to develop open source
software in Asian countries in our first languages.
Five universities from Asia joined this project,
including Shanghai Jiao Tong University, Seoul National
University, Handong University, University of Malaysia, and
Kyoto University.
In this experiment, team members never met in person,
but completed the software using multilingual communication
tools, like a web BBS with machine translation.
We did a fairly long experiment, from April
2002 to December 2002.
Language services were available, but we had a hard
time organizing and creating the language
services for this experiment.
We used a so-called translation pentagon
in 2002:
five languages, Japanese, English, Chinese,
Korean, and Malay, and we needed machine translators to
cover those language pairs.
There were a lot of questions.
How can we collect translation engines to
cover all five languages?
How can we understand their contracts?
The contracts differ, and sometimes they are very hard for
us to understand.
And how do we evaluate their services?
There is no quality assurance in machine translation.
How much should we pay for covering the five languages?
It's a lot:
usually a million yen for each language pair.
And how can we customize the provided services?
So we decided to start the project
called the Language Grid.
We believe that language is still the biggest problem in
intercultural collaboration.
Though English has become a world standard language, people
don't use it in local activities.
The language barrier is serious, especially in Asia,
because we are not taught our neighboring languages.
I mean that Japanese are not taught Chinese or Korean,
and in China people are not taught Japanese or
Korean, and so on.
And language services are often not
accessible or usable.
Only big organizations like Google can buy services and
create their own services.
But if people in NPOs and universities want to
create their own language services, they have a lot of
difficulty accessing and using those services.
And so our goal is to create a language grid as an
infrastructure on the internet.
We want to improve the accessibility and usability of
language services, so that communities can develop their
own language services.
The Language Grid architecture has two
different goals.
One is called the horizontal language grid: to provide
standard language services worldwide, and to create composite
services by connecting existing language services
upon a user's request.
We add a language service ontology to
standardize their interfaces.
The other goal is the vertical language grid: to create
community language services to support intercultural
activities.
So this is the Language Grid architecture.
We have a horizontal language grid connecting
standard language services, including WordNet,
an English dictionary provided by Princeton
University;
EDR, provided by NICT in Japan; Chinese
dictionaries; and machine translation between the main
languages.
And we have a vertical language grid to support
community activities, including, for example, medical
interpretation support at local hospitals.
So we will work on a language service ontology to
standardize the APIs of existing language services--
I mean language services including language
resources and also language processing functions, like
translation, paraphrasing, and so on.
We want to create community language services
easily, by using those service ontologies.
We also work on language web services.
As you know, standardization is in progress
for web services, and research is in progress on
semantic web services.
Our goal is human-agent collaboration to create
composite language services, and we want to generate
semantic wrappers for newly created language services.
Here is the language web service architecture,
which includes three layers.
The bottom layer is the language service layer, including
atomic components, composite
components, and so on.
The second layer is the scenario execution layer.
This is semantic web services, using BPEL, WSDL,
UDDI, and OWL.
The top layer is called the scenario collaboration layer,
where we will make a repository of service scenarios and
semantic wrappers.
We are now implementing a Language Grid prototype,
hopefully finished by the end of March this year.
So let me introduce a few field studies we are
planning with NPOs.
We are planning to work with NPOs
for three purposes.
One is hospital support for foreign patients.
The second one is a universal playground for
kids around the world.
And the third one is making a radio program for disaster
management.
So let me quickly review the three cases.
The first one is medical interpretation services for
local hospitals.
The name of the NPO is the Center for Multicultural
Information and Assistance, located in Kyoto.
This NPO started a medical interpretation service in
September 2003 to assist foreign patients.
At this moment, Chinese and
Portuguese are highly needed.
In this case, translation should be very accurate,
and machine translation is not useful for this purpose
because of its low quality, so we need to use multilingual
parallel texts.
And they want to develop their own language resources that are
also useful for local hospitals.
This is an image of how we will use the Language Grid for
this purpose.
Suppose here are multilingual parallel texts
for medical use, and the NPO also wants to create
multilingual parallel texts for some local hospital's use.
If we have a similarity evaluation program for two
sentences, we can easily create composite services by
using a workflow to assist interpreter volunteers.
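A minimal sketch of such a workflow, assuming a toy similarity function (here `difflib` from the standard library) and a tiny invented parallel corpus; a real Language Grid deployment would expose these pieces as web services rather than local functions.

```python
import difflib

# Hypothetical vetted parallel texts for medical use.
parallel_texts = {
    "Where does it hurt?": "どこが痛みますか。",
    "Do you have a fever?": "熱はありますか。",
}

def similarity(a, b):
    """Toy sentence-similarity service."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

def suggest_translation(query, threshold=0.7):
    """Workflow: find the closest vetted sentence; give up below the threshold
    rather than risk an inaccurate machine translation."""
    best = max(parallel_texts, key=lambda s: similarity(query, s))
    return parallel_texts[best] if similarity(query, best) >= threshold else None

print(suggest_translation("Where does it hurt"))  # close match: vetted translation
print(suggest_translation("xyz"))                 # no close match: None
```

Falling back to "no answer" instead of a low-quality machine translation matches the accuracy requirement of the medical setting described above.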
The second case is the NPO called
Pangaea, which wants to create a universal playground
for the world's kids.
The NPO was launched in Tokyo by a researcher from the MIT
Media Lab, and activities are ongoing in Tokyo and Kyoto,
with overseas branches in Korea, Kenya,
Australia, Austria, and so on.
They are now collecting pictograms drawn by kids
around the world; that means they are developing
their own language resources, pictogram repositories,
grounded on WordNet.
So this is the pictogram language, grounded
on WordNet: they are trying to connect
pictograms to the concepts of WordNet.
This example shows how pictograms differ between
countries.
For example, this pictogram means
morning in Japan.
But it seems that this picture doesn't mean morning in Kenya.
In Kenya, the morning should be like this.
So they started to ground the meaning of the pictograms using
WordNet.
So this is a new language resource created by an NPO.
We have a language grid, and people can put their
own language resources on that language grid.
Then, if we have standard language services
like Japanese-Korean translation and Korean
morphological analysis,
we can again easily create a workflow to create new
services like this.
Suppose a Japanese kid inputs a Japanese sentence here;
the system can translate it into a Japanese
sentence with pictograms, and then translate that into a
Korean sentence with pictograms. The important
thing is that this mechanism allows NPOs to create their
own language services, by using their own language resources
together with standard language services.
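The composed service described above could be sketched as a simple pipeline. The pictogram repository, the word-level dictionary, and the use of English words as stand-ins for Japanese are all illustrative assumptions, not the NPO's actual data.

```python
# Hypothetical pictogram repository (grounded concepts) and word dictionary.
pictograms = {"morning": "PICTO:sunrise", "friend": "PICTO:two_kids"}
ja_ko_dict = {"hello": "안녕", "morning": "아침", "friend": "친구"}

def insert_pictograms(words):
    """Service 1: replace words that are grounded in the pictogram repository."""
    return [pictograms.get(w, w) for w in words]

def translate_words(words):
    """Service 2: toy word-by-word translation; pictograms pass through unchanged."""
    return [w if w.startswith("PICTO:") else ja_ko_dict.get(w, w) for w in words]

def composite_service(sentence):
    """Workflow: text -> text with pictograms -> translated text with pictograms."""
    return " ".join(translate_words(insert_pictograms(sentence.split())))

print(composite_service("hello morning friend"))  # pictograms survive translation
```

The point of the composition is visible in the output: words covered by the community's pictogram repository bypass the standard translation service, so the NPO's own resource and the standard service work together.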
The third case is the NPO called Kyoto Community Radio.
This NPO started, in March 2003, the first FM radio
station run by an NPO in Japan.
Radio program production workshops have been
organized by foreign residents since May 2005 to make radio
programs in various languages, and especially they want to
gather and broadcast information in various
languages in case of a disaster like a big earthquake.
Here is an example of supporting very different people,
from different countries, using different languages, to
collaborate on radio program production; they need
some multilingual backbone system.
And again, they can easily create it using available
standard language services, like Japanese-English
translation, English morphological analysis, an
English-Hindi dictionary, and so on.
The point is, again, they can create their own language
services for their own purposes.
OK, here is a summary.
I introduced the architecture of the Language Grid, to
increase the accessibility and
usability of language services.
It includes two different kinds of language grids: a
horizontal one, for connecting standard language services,
and a vertical one, for creating
community language services.
We hope that the impact of the Language
Grid will be fairly big.
Language services will not be created just by professionals,
but by local communities.
We want to make a positive spiral of creation,
usage, and standardization of language services.
OK.
Thank you very much.
So here are the contributors.
This is fairly multidisciplinary work, so
people are from different areas, including natural
language processing, AI and agents, sociology,
collaboration research, and so on.
OK, that's enough.
Thank you very much.