MALE SPEAKER: Welcome everyone.
We're very pleased today to have professors and students
from Kyoto University in Japan here, visiting Google.
We're going to have Professor Katsumi Tanaka give us an
overview of the research at the Department of Social Informatics,
Graduate School of Informatics, and then we will have
Professor Toru Ishida give us an overview of the research at
the Laboratory for Global Information Network in the
Department of Social Informatics.
Professor Tanaka.
KATSUMI TANAKA: My name is Katsumi Tanaka,
from Kyoto University.
Thank you very much for kindly accepting
our visit to Google.
Today I'd like to briefly introduce the research
areas of my lab.
And first of all, this is the list of students and
professors and also researchers from Japan.
I came from Kyoto University in Japan, and Professor
Ishida also came from Kyoto University, and we have some
researchers from a Japanese government research
laboratory, NICT.
And also there are many graduate students at Kyoto
University here.
So I hope somebody in Google is watching me.
And my topic is Towards Next-Generation Search Engines
and Browsers.
So this is my image of a next-generation search engine.
[UNINTELLIGIBLE], the next image of a web search engine
will be search beyond media types and places.
You can see that this horizontal axis means
the place of content storage,
and the vertical axis is the
media type of the content.
So now we have text search engines, and also video search
engines for video searches, and Google is already exploring
several types of content search, so we are now
going up this axis: not only text data, but also image retrieval and
video retrieval.
So this is one direction of conventional web search.
And then there is the horizontal axis.
This is very interesting, I mean.
Currently, web search engines cover, of course, web content
and also desktop content, and conventional web
search engines are now moving slightly from left to right,
because some search engines are now able to retrieve not only
personal content on the desktop, but also external
databases, encyclopedias, or GIS content.
Now we are interested in this area.
I mean, recently televisions
have hard disk or DVD recorders.
They can store a lot of content, right?
So maybe sometime we will need a search engine to
search the content stored in hard disk or DVD recorders.
And even a search engine for the video content of your digital
camera, right?
So this is another axis.
So then, in order to imagine the next-generation web
search, maybe we will need integrated search technology.
And [INAUDIBLE PHRASE].
Very fortunately, web content has hyperlinks, and so Google
was successful with a very new ranking algorithm, called
PageRank.
But unfortunately, the program content stored on a
hard disk recorder
doesn't have any hyperlinks.
So then we need some new non-link-based ranking mechanism.
And so now I will briefly sketch an overview of our
research activities in the area of search engines, or
search technologies.
One direction is integrated search.
This slide shows a TV program.
While watching the TV program, the system can automatically retrieve
the related web pages in real time.
And I hear that in Google's research lab, maybe some
researchers are also engaged in this kind of research.
So this is one example of integrated search:
while you watch a TV program, the related
web pages are retrieved automatically at the same time.
This is another example of integrated search:
a combination of GIS content and web content.
This system automatically detects
landmark places from web content,
based on data mining technology.
A landmark place is a famous place in a town or
area.
Furthermore, based on those landmark places, the
system automatically retrieves web pages concerned with those
landmark places.
So this is another example: a combination of GIS content
and web content.
The third one is about image search, and I hope some
people at Google listen to this story.
This is an example of Google image search.
The precision ratio of Google image
search is now pretty good.
But the recall ratio of Google image search is not so
good.
Here is one example.
You can try it later.
Using the Google image search engine, why not input just
three keywords: Mount Fuji, and sunset, and snow?
The answer of Google image search is zero.
I mean, the number of hits is zero.
So can you believe that there are no images concerned
with Mount Fuji, and sunset, and snow in the whole World Wide Web
information space?
The answer is no.
There are many, many images concerned with Mount Fuji, and
sunset, and snow.
So then, how do we improve the recall ratio of the image
search engine?
The idea is basically simple.
Our idea is this:
we relax the query keywords.
We select only Mount Fuji and sunset,
and input just these two keywords into
Google image search.
The third keyword, snow, is input into ordinary text search,
I mean Google web search.
Then we take the intersection of these two result sets, and you
can find many, many relevant images concerned with Mount
Fuji, sunset, and snow.
So you can discover many, many relevant
images for all three keywords.
This is our idea to improve the Google image search
engine, especially to improve the recall ratio.
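The relaxation-and-intersection idea described above could be prototyped roughly as follows. The result sets, the `page_url` field, and the helper name are hypothetical placeholders standing in for real search engine responses, not an actual Google API.

```python
def intersect_results(image_hits, text_hits):
    """Keep image hits whose host page also appears in the full-keyword text search."""
    text_urls = {hit["page_url"] for hit in text_hits}
    return [hit for hit in image_hits if hit["page_url"] in text_urls]

# Hypothetical results for the relaxed image query "Mount Fuji sunset".
image_hits = [
    {"image": "fuji_snow.jpg", "page_url": "http://example.com/fuji-winter"},
    {"image": "fuji_summer.jpg", "page_url": "http://example.com/fuji-summer"},
]
# Hypothetical results for the full text query "Mount Fuji sunset snow".
text_hits = [{"page_url": "http://example.com/fuji-winter"}]

print(intersect_results(image_hits, text_hits))  # only the snow-related image survives
```

The relaxed image query improves recall, and intersecting with the full-keyword text search restores precision for the dropped keyword.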
And this is another example of integrated search.
We have already developed a search engine to retrieve not
only web content, but also TV program contents.
And this is the one example.
Now you can see a video image of our search
engine, which tries to retrieve not only web pages,
but also TV program content for your query.
Sorry, this is in Japanese, but the user has input the
keywords space and space shuttle.
And this is the answer.
You can see, this is a web page.
This is a web page.
But here, this is TV program content stored in your hard
disk recorder.
So once you input your keyword query, you can
retrieve web pages.
Furthermore, you can retrieve TV content.
And also you can browse the returned answers.
Now the user is focusing on this answer.
This is the TV program content, and if you zoom in, you
can see the details of the TV program content.
Also, very recently, young people in Japan, while they
are watching TV, are also using a PC and the internet.
Especially, they are communicating through online chat
systems. So this is another example of integrated search,
which tries to integrate not only TV program content, but
also online chat information.
This is a very brief introductory
video of this system.
This is the usual TV program,
and this is the user interface.
If you zoom out, you can see not only the
TV program content, but also, here, the closed-caption data
and, furthermore, here, the online chat information
concerned with this portion of the video program.
So this is another example of integrated search: TV
program content and online chat information.
This is also another example of integrated search.
We have already developed a browser which can browse
multiple websites concurrently.
Suppose that you are reading newspapers.
In this system, you can read multiple
web news sites concurrently.
And if you pick one of your favorite news articles, the
related news article from the other news
site is ready to read.
So this window is vertically synchronized with
this window.
The image is that, on your desktop, you can put two
newspapers side by side and compare the related news articles.
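One simple way to pair related articles across two news sites, as the synchronized browser does, is a keyword-overlap heuristic. The keyword extraction below is a toy stand-in for whatever matching the actual system uses, and the article texts are invented.

```python
def keywords(article):
    """Toy keyword extraction: lowercase words longer than four characters."""
    return {w.lower().strip(".,") for w in article.split() if len(w) > 4}

def related(article, other_site_articles):
    """Pick the article from the other site with the largest keyword overlap."""
    return max(other_site_articles, key=lambda a: len(keywords(article) & keywords(a)))

site_a_story = "Google announces new search ranking algorithm"
site_b_stories = [
    "Weather forecast sunny skies tomorrow",
    "Search ranking algorithm updated by Google",
]
print(related(site_a_story, site_b_stories))
```

A real system would also need to keep the two views scrolled in sync, but the pairing step is the core of "related article from the other site is ready to read."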
OK, and this is about the ranking algorithm.
Currently, the PageRank algorithm is a very
new and very good algorithm to rank each web page.
Our idea is slightly different.
Why rank web pages page by page?
Our idea is to rank a collection of pages.
I mean, we make page pairs from the answer
pages and use them as the unit for ranking.
So the unit of ranking is not each single
page, but a pair of pages.
Here is an example.
Suppose that you are querying UC Berkeley and Stanford.
This probably means you want to compare UC Berkeley and
Stanford, so you just input the keywords UC
Berkeley and Stanford.
The Google search engine will return these pages as
the candidate answers.
Some page has a lot of description about
UC Berkeley, but very few descriptions of Stanford.
And here, this page describes
UC Berkeley very much,
but Stanford very little.
This page describes Stanford very much, and UC Berkeley
very little.
But the intention of this query is probably that the user wishes
to compare UC Berkeley and Stanford.
So we want to make a pair of these pages.
That is our idea: making a pair of pages from this one and
this one, right?
And this pair of pages may be much more relevant to your
query than each single page.
That is the idea of ranking page
collections, not page by page.
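The pair-ranking idea could be sketched like this. The `coverage` function is a deliberately crude term-count stand-in for a real relevance score, and the sample pages are invented.

```python
from itertools import combinations

def coverage(page, term):
    """Toy relevance: how often a page mentions a term."""
    return page["text"].lower().count(term.lower())

def rank_pairs(pages, terms):
    """Score each pair of pages by how well the pair, together, covers every term.
    A pair is only as good as its worst-covered term (the min over terms)."""
    scored = [
        (min(max(coverage(a, t), coverage(b, t)) for t in terms), a["url"], b["url"])
        for a, b in combinations(pages, 2)
    ]
    return sorted(scored, reverse=True)

pages = [
    {"url": "berkeley.html", "text": "UC Berkeley Berkeley Berkeley campus, Stanford mentioned once"},
    {"url": "stanford.html", "text": "Stanford Stanford Stanford research"},
    {"url": "mixed.html", "text": "Berkeley and Stanford compared"},
]
print(rank_pairs(pages, ["Berkeley", "Stanford"]))
```

With the min-over-terms score, the two specialist pages are ranked as the best pair, which matches the comparison intent behind the query even though neither page alone covers both universities well.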
So I have almost finished
this quick review of our research activity.
We also explored some new types of browsers, especially
for browsing multimedia content.
Usually, every day we are using Internet Explorer,
that is, a web browser.
And also, every day we watch TV.
That is not unique at all.
But our idea is the reverse.
Why not watch the web?
Why not browse TV? In order to do that,
some kind of media conversion
technology is needed.
I mean, transform a web page into a TV program, or
transform TV program content into web-like pages.
Watching or listening is
very passive [UNINTELLIGIBLE],
and browsing is very active.
In Internet Explorer, every day we click, we
scroll up and down, and we read the text of web pages.
But sometimes we are tired, right?
Maybe sometimes we wish to just watch, like a video.
The conventional Internet Explorer interface is a
very active interface.
So sometimes we may wish to have a very passive
[UNINTELLIGIBLE]
the internet.
Here is an example.
OK, this example has no voice, but it is an
example of transforming a news article web page into
TV-program-like content.
These funny characters are speaking to each other.
The topic comes from a news article on some website.
So users can just watch or listen, as if it were
[? pre-recorded. ?]
This is one example.
If somebody can understand Japanese, this is
a very funny dialogue.
But anyway, this is in Japanese,
with no voice.
And maybe this technology can be
provided on PDAs.
Because mobile phones and PDAs both have small screens,
active browsing is not so easy in that case.
I mean, [UNINTELLIGIBLE]
scrolling up and down.
So a much more passive mode of reading web pages
may be necessary, and this kind of
technology can be used for that.
And this is the reverse direction:
converting TV program content into web-like pages.
[INAUDIBLE PHRASE].
This is a very short demo, and this program is
running on my PC now.
This is a very old news program from TV. You know who he is.
So this is a news program.
But now, suppose you are tired of it.
You don't want to see this news,
and you want to search for other news stored
on the hard disk recorder.
But your hard disk recorder is huge.
It may store TV programs for one year,
not just one week.
So I want to quickly search the TV news
stored in my hard disk recorder.
Our interface is this.
The interface automatically tries to convert
the TV program into web-like pages, just by zooming out.
It can zoom out.
Then this TV program is automatically transformed into
something like a web page.
This takes a [INAUDIBLE].
It takes the text from the closed captions of the TV program.
And furthermore, some
hyperlinks are automatically generated.
So you can quickly browse the whole content of your hard
disk recorder,
and you can select your favorite news article.
And then again, you can [INAUDIBLE PHRASE].
Finally, we are now interested in the trust of web search.
We have just started this research.
The problem is: to what extent can we believe the results of
a web search engine?
Even for the Google image search engine, or the Google
search engine, where a very good ranking
algorithm exists,
to what extent can we believe that the top-ranked page
is the best or not?
That is the problem of trust in web search.
In order to consider the trust of a web search engine,
maybe we should consider three items. One is concerned with
the content itself.
I mean, the searched page: does the page offer
fair information?
In order to analyze this, of course, we can use several
data mining technologies.
The second axis is social acceptance.
I mean, given some web page in a search result,
how do people evaluate that web page?
This is not so new.
I mean, the Google PageRank algorithm
is one way to represent the degree of social
acceptance of web pages.
But maybe we can explore other technologies in
order to consider the trust of search results.
The third axis is the reliability of the author.
This means that if you are given some web page that is
top-ranked for your query, you cannot know who authored it,
or how authors collaborated to create the web page.
So maybe some technology is needed to decide how far we can
trust the author of a web page.
This is very important.
I have no more time.
But today I have some materials, I mean my papers,
with much more detail.
If you are interested, please come and see me.
So anyway.
This is my conclusion.
In order to imagine the next-generation search engine, I
pointed out several issues, especially search
beyond media types and places.
This is very important, I think. The
browsing style may also be different, or it may come
in much more variety.
And furthermore, the trust of search will
be much more important.
Then, in order to realize this kind of image, I also pointed out
some basic technologies we are now exploring.
OK, so thank you very much for coming here.
Thank you.
MALE VOICE: I'd now like to introduce Professor Toru
Ishida from the Department of Social Informatics of Kyoto
University, who's going to talk to us about the Language Grid,
an infrastructure for intercultural collaboration.
Professor.
TORU ISHIDA: Good afternoon.
I'm Toru Ishida from Kyoto University.
I'd like to talk about the Language Grid, an
infrastructure for intercultural collaboration.
This is not yet a result; it is a plan.
So let me start with my motivation.
We have run a so-called intercultural collaboration
experiment since 2002, and I want to talk about why I want
to start the Language Grid project.
Then I will talk about the Language Grid architecture,
including the language service ontology and
language web services.
We want to work with various NPOs in Japan,
including hospital support for foreign patients, universal
playgrounds for kids around the world, and making a
radio program for disaster management.
So we will work on this project with three NPOs.
OK, here is the motivation.
The question is: do we really share information on the web?
It seems to us there is no standard language on
the internet now.
This is an online language population survey from September
2004. From this survey, it seems the English-speaking
population is 35%,
European languages are more than 29%,
and Asian languages are more than 26%.
So we would have to learn a lot of languages to understand all
the information.
But because that is impossible, we want to try to use
machine translation.
And if we use machine translation,
we get results like this:
where a human translator would translate a phrase as "don't
worry, it's nothing," a machine translator may say "not
caring, trivial problems." So the question is: what happens
when we use machine translation in intercultural
collaboration?
We created that experiment in 2002 and are still
continuing it in 2005.
The experiment in 2002 was to develop open source
software in Asian countries in our first languages.
Five universities from Asia joined this project,
including Shanghai Jiao Tong University, Seoul National
University, Handong University, University of Malaysia, and
Kyoto University.
In this experiment, team members never met in person,
but completed the software using multilingual communication
tools, like a web BBS with machine translation.
We did a fairly long experiment, from April
2002 to December 2002.
Language services were available, but we had a hard
time organizing and creating the language
services for this experiment.
We used a so-called translation pentagon
in 2002:
five languages, Japanese, English, Chinese,
Korean, and Malay, and we needed machine translators to
cover those language pairs.
There were a lot of questions.
How can we collect translation engines to
cover all five languages?
How can we understand their contracts?
The contracts differ, and sometimes they are very hard for
us to understand.
And how do we evaluate their services?
There is no quality assurance in machine translation.
How much should we pay for covering the five languages?
It's a lot:
usually a million yen for each language pair.
And how can we customize the provided services?
So we decided to start the project
called the Language Grid.
We believe that language is still the biggest problem in
intercultural collaboration.
Though English has become a world standard language, people
don't use it in local activities.
The language barrier is serious, especially in Asia,
because we are not taught our neighboring languages.
I mean that Japanese are not taught Chinese or Korean,
and in China people are not taught Japanese or
Korean, and so on.
And language services are often not
accessible or usable.
Only big organizations like Google can buy services and
create their own services.
But if people in NPOs and universities want to
create their own language services, they have a lot of
difficulty accessing and using those services.
And so our goal is to create a language grid as an
infrastructure on the internet.
We want to improve the accessibility and usability of
language services, so that communities can develop their
own language services.
The Language Grid architecture has two
different goals.
One is called the horizontal language grid: to provide
standard language services worldwide, and to create composite
services by connecting existing language services
upon a user's request.
We add a language service ontology to
standardize their interfaces.
The other goal is the vertical language grid: to create
community language services to support intercultural
activities.
So this is the Language Grid architecture.
We have a horizontal language grid connecting
standard language services, including WordNet,
an English dictionary provided by Princeton
University;
EDR, provided by NICT in Japan; Chinese
dictionaries; and machine translation between the main
languages.
And we have a vertical language grid to support
community activities, including, for example, medical
interpretation support at local hospitals.
So we will work on a language service ontology to
standardize the APIs of existing language services--
I mean language services including language
resources and also language processing functions, like
translation, paraphrasing, and so on.
We want to create community language services
easily, by using those service ontologies.
We also work on language web services.
As you know, standardization is in progress
for web services, and research is in progress on
semantic web services.
Our goal is human-agent collaboration to create
composite language services, and we want to generate
semantic wrappers for newly created language services.
Here is the language web service architecture,
which includes three layers.
The bottom layer is the language service layer, including
atomic components, composite
components, and so on.
The second layer is the scenario execution layer.
This is semantic web services, using BPEL, WSDL,
UDDI, and OWL.
The top layer is called the scenario collaboration layer,
where we will make a repository of service scenarios and
semantic wrappers.
We are now implementing a Language Grid prototype,
hopefully finished by the end of March this year.
So let me introduce a few field studies we are
planning with NPOs.
We are planning to work with NPOs
for three purposes.
One is hospital support for foreign patients.
The second one is a universal playground for
kids around the world.
And the third one is making a radio program for disaster
management.
So let me quickly review the three cases.
The first one is medical interpretation services for
local hospitals.
The name of the NPO is the Center for Multicultural
Information and Assistance, located in Kyoto.
This NPO started a medical interpretation service in
September 2003 to assist foreign patients.
At this moment, Chinese and
Portuguese are highly needed.
In this case, translation should be very accurate,
and machine translation is not useful for this purpose
because of its low quality, so we need to use multilingual
parallel texts.
And they want to develop their own language resources that are
also useful for local hospitals.
This is an image of how we will use the Language Grid for
this purpose.
Suppose here are multilingual parallel texts
for medical use, and the NPO also wants to create
multilingual parallel texts for some local hospital's use.
If we have a similarity evaluation program for two
sentences, we can easily create composite services by
using a workflow to assist interpreter volunteers.
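A minimal sketch of such a workflow, assuming a toy similarity function (here `difflib` from the standard library) and a tiny invented parallel corpus; a real Language Grid deployment would expose these pieces as web services rather than local functions.

```python
import difflib

# Hypothetical vetted parallel texts for medical use.
parallel_texts = {
    "Where does it hurt?": "どこが痛みますか。",
    "Do you have a fever?": "熱はありますか。",
}

def similarity(a, b):
    """Toy sentence-similarity service."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

def suggest_translation(query, threshold=0.7):
    """Workflow: find the closest vetted sentence; give up below the threshold
    rather than risk an inaccurate machine translation."""
    best = max(parallel_texts, key=lambda s: similarity(query, s))
    return parallel_texts[best] if similarity(query, best) >= threshold else None

print(suggest_translation("Where does it hurt"))  # close match: vetted translation
print(suggest_translation("xyz"))                 # no close match: None
```

Falling back to "no answer" instead of a low-quality machine translation matches the accuracy requirement of the medical setting described above.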
The second case is the NPO called
Pangaea, which wants to create a universal playground
for the world's kids.
The NPO was launched in Tokyo by a researcher from the MIT
Media Lab, and activities are ongoing in Tokyo and Kyoto,
with overseas branches in Korea, Kenya,
Australia, Austria, and so on.
They are now collecting pictograms drawn by kids
around the world; that means they are developing
their own language resources, pictogram repositories,
grounded on WordNet.
So this is the pictogram language, grounded
on WordNet: they are trying to connect
pictograms to the concepts of WordNet.
This example shows how pictograms differ between
countries.
For example, this pictogram means
morning in Japan.
But it seems that this picture doesn't mean morning in Kenya.
In Kenya, the morning should be like this.
So they started to ground the meaning of the pictograms using
WordNet.
So this is a new language resource created by an NPO.
We have a language grid, and people can put their
own language resources on that language grid.
Then, if we have standard language services
like Japanese-Korean translation and Korean
morphological analysis,
we can again easily create a workflow to create new
services like this.
Suppose a Japanese kid inputs a Japanese sentence here;
the system can translate it into a Japanese
sentence with pictograms, and then translate that into a
Korean sentence with pictograms. The important
thing is that this mechanism allows NPOs to create their
own language services, by using their own language resources
together with standard language services.
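The composed service described above could be sketched as a simple pipeline. The pictogram repository, the word-level dictionary, and the use of English words as stand-ins for Japanese are all illustrative assumptions, not the NPO's actual data.

```python
# Hypothetical pictogram repository (grounded concepts) and word dictionary.
pictograms = {"morning": "PICTO:sunrise", "friend": "PICTO:two_kids"}
ja_ko_dict = {"hello": "안녕", "morning": "아침", "friend": "친구"}

def insert_pictograms(words):
    """Service 1: replace words that are grounded in the pictogram repository."""
    return [pictograms.get(w, w) for w in words]

def translate_words(words):
    """Service 2: toy word-by-word translation; pictograms pass through unchanged."""
    return [w if w.startswith("PICTO:") else ja_ko_dict.get(w, w) for w in words]

def composite_service(sentence):
    """Workflow: text -> text with pictograms -> translated text with pictograms."""
    return " ".join(translate_words(insert_pictograms(sentence.split())))

print(composite_service("hello morning friend"))  # pictograms survive translation
```

The point of the composition is visible in the output: words covered by the community's pictogram repository bypass the standard translation service, so the NPO's own resource and the standard service work together.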
The third case is the NPO called Kyoto Community Radio.
This NPO started, in March 2003, the first FM radio
station run by an NPO in Japan.
Radio program production workshops have been
organized by foreign residents since May 2005 to make radio
programs in various languages, and especially they want to
gather and broadcast information in various
languages in case of a disaster like a big earthquake.
Here is an example of supporting very different people,
from different countries, using different languages, to
collaborate on radio program production; they need
some multilingual backbone system.
And again, they can easily create it using available
standard language services, like Japanese-English
translation, English morphological analysis, an
English-Hindi dictionary, and so on.
The point is, again, they can create their own language
services for their own purposes.
OK, here is a summary.
I introduced the architecture of the Language Grid, to
increase the accessibility and
usability of language services.
It includes two different kinds of language grids: a
horizontal one, for connecting standard language services,
and a vertical one, for creating
community language services.
We hope that the impact of the Language
Grid will be fairly big.
Language services will not be created just by professionals,
but by local communities.
We want to make a positive spiral of creation,
usage, and standardization of language services.
OK.
Thank you very much.
So here are the contributors.
This is fairly multidisciplinary work, so
people are from different areas, including natural
language processing, AI and agents, sociology,
collaboration research, and so on.
OK, that's enough.
Thank you very much.