>>Manos: Hello, everyone. My name is Manos and I'm an engineer on the third QI team.
It is my honor to introduce my advisor and associate professor from Virginia Tech, Dr.
Manuel Perez-Quinones.
Today he'll be talking about his work in personal information management and how users' personal
information management practices can be leveraged by the organization for knowledge management
as well.
So this talk is also part of the perspective as speakers' series which attempts to impart
images by sharing their research and their life stories with everyone. All of us.
So without much further ado - Dr. Perez-Quinones. Thank you.
>>Dr. Perez-Quinones: And I'm not your advisor anymore. Come on. You graduated. We're colleagues.
[laughter]
Now having said that I have on my slides here Manos Dungare is my Ph.D. student. So I guess
I'm in the same boat.
So I want to thank Manos for inviting me and I want to thank the Google, the Hispanic Googler
Network. I have a hard time saying that. It's a group of Hispanics here that, that saw my
talk announced and wanted to co-sponsor it. And one of them is behind this screen here
hiding.
What I wanted to do is talk a little bit about personal information management research in
general. Then talk about a couple of studies.
Two particular studies that we've done in research and my apologies to Manos for not
having picked his dissertation as one of the two.
And then move on to something that we're pushing the research into - how do we sort of take
advantage of the personal behavior and practices as a way to capture knowledge management?
And that is work that is evolving now and it's starting and it's a little bit more speculative
and open.
And then I want to close because the Hispanic group asked me to sort of give a brief life
history or something around along those lines. Words of wisdom of sort. So, I'll, at the
end, I'll say a couple of things on that.
This is the world we live in today. We have just huge amounts of information. Lots of
displays. Lots of digital information that comes to us in many different forms, and we
are clearly overloaded.
And one statistic that I saw recently was that in 2005 there were 35 billion corporate
emails exchanged per day. And a lot of workers are reporting that they're just feeling overwhelmed
by just trying to manage the information.
The other thing that is happening is that we have a proliferation of devices. We have
handheld computers, phones, PDA's, game boxes and all of them have email, all of them have
Web access, all of them are connected, all of them access your calendar - and it's starting
to get to a problem where you - in some cases - you have to wonder: "Where did I put that
stuff? I wrote a note to myself on one of my machines someplace."
And I think that Cloud and some of the other things that are happening will alleviate that,
but the current situation today is that you still have to do a lot of syncing information.
These two things are things that we try to address in personal information management
research, and in particular my research group at Virginia Tech has been looking at a lot
of the handheld devices and small device work.
So PIM research is about studying how people find information, keep it organized and reuse
it down the road. It is very, very much targeted to the personal aspect of that. It's really
much targeted to "How do I find my emails? How do I organize my emails? How do I organize
my files?" Not the traditional IR type research of: "How do I issue a query that will result
in the right document?"
The research and the work has sort of been organized around this framework of finding,
keeping, organizing, and reusing information. So a lot of the studies have been finding
studies or making a keeping decision: which of the information that I've run across do I
decide to keep or not?
And the reason to study it is the previous two slides. We are overwhelmed. We have a lot
of information demands. Information demands attention which means you get distracted quickly
from one thing and the other. That leads to a decrease in productivity. And we also want
to do all this so that we can develop the next generation of technology - one that's better
suited to handle some of these issues.
PIM is hard to study because of the personal part of it. You cannot do controlled lab studies.
For example, if you ask me to do a study on how I handle my email, I can't put you in
my inbox. The people that send email to me - you don't recognize; you don't know if
they're critical; you don't know if they're urgent. Where I file my emails you have no
idea where they are.
So whatever you do in that task is gonna be very, very inefficient and just totally irrelevant.
So personal information management has to be studied within the context of your personal
information.
So if I'm gonna do a study on how you do email I have to somehow: one, get you to sit down
and do your email in a place where I can see; two, I have to sort of worry about privacy.
I don't wanna see emails that maybe don't belong to a public view of your personal stuff.
But at the same time, I have to do that to study it in the right context. So it is very
challenging. It tends to be done in this idea of diary studies. We - there are a lot of
studies that tend to just follow somebody along.
[background noise]
What's going on? Okay.
Follow somebody along for a week and ask them questions on a daily basis. As opposed
to a controlled lab. Now, we can do some controlled labs in some sort of smaller parts and I'll
show you one example where we did one of those.
The other part of PIM that's sort of interesting and hard to study is that people develop strategies
to manage their information. If you look at how you handle your email, it's probably very
different than the person sitting next to you. Some of you might use tags on Gmail,
some of you might use folders, some of you might use smart folders on a mail app on the
Mac. Every single person has a different strategy; and it's a strategy that is very heavily dependent
on the context of your work.
Manos and I have talked about email organization for over the last, I don't know, four years
and we both sort of have it under control and they're both very different. And his works
well because of the type of emails he gets. I wish I could do it the way he does it -
I can't. And he probably has changed now - we haven't talked. You've probably changed
the way you do it now since you're out of school.
That's one of the things we found: life transitions change your strategies.
So the strategies are very, very, very much personal. It's a combination of life, context,
person, and the type of tool you use. And, you know, changing from being a student to
professional changes a lot of the types of information you manage.
And again, so that's another reason why you have to study - and not only in a very personal
basis - but personal basis within a particular context of that person's life.
A really brief history of PIM research. A lot of the sort of foundational studies
on PIM happened in the '90's. PIM as a phrase was mentioned in Lansdale's book in '88, but
it really didn't start as a collection of researchers identifying themselves as PIM researchers
until 2001 or so. That's when William Jones at Washington started doing some research
calling it specifically PIM; and then he led a workshop in 2005 which led to a special
issue in CACM and led to another workshop; led to a book; and then led to another workshop,
etc. and so forth.
The last two or three workshops, actually the last four workshops, Virginia Tech has
been involved heavily, either me or Rob Capra, who was another Ph.D. of mine or Manos Dungare
directly involved with the organization; directly involved with presentations and having papers.
And there's still a plan for a conference in 2011. So, that's still sort
of in the air. The last workshop was a few months ago in Vancouver
and Manos presented remotely one of his dissertation papers.
So, I'm gonna go over the finding, keeping, organizing just to give you a flavor of the
kind of things that we research in PIM.
In the Finding Information we study how people search and browse for information as a way
to locate information they need to accomplish some particular task. Because PIM
is very focused on the personal aspect of it, we focus on finding for personal use or
finding in your personal store. We call it the PSI, personal store information. Sort of the
information you've collected. You know, the bookmarks, the history cache
and all that. It's my personal store and you can search just that by using tools like the
ones you get on your desktop.
We also find information just by encountering it. And this is one of those things that
is difficult. In your day to day browsing you might see something on the side that
you find interesting, and you follow that trail and you find some information that is
relevant, but you weren't on your way to get that information. And those tend to be more
difficult to get back to, because you sort of stumble onto a trail that led to the
information. You weren't explicitly looking for it.
We also have this notion - Jamie Teevan described it - of orienteering and teleporting.
Orienteering is a search behavior that you don't exactly know where you are going, but
you're browsing pages and making almost local decisions. On every context you look at the
page and you go: "It's this choice." And then on the next page you go: "It's this choice."
And it's similar to the notion of orienteering. If you are in the woods - that there is no
path - but yet at every step you look around - you go: "Let's go that way" - "Let's go
that way" - and so forth.
And it's sort of counter or juxtaposed to the notion of teleporting which Jamie described
as jumping directly to a goal. And our work has modified that a little bit with the issue
at the bottom, of information addressability. What we describe
teleporting as, is knowing exactly where you are going even though you're not jumping directly
there.
And orienteering is, you don't know where you are going - at every step you have to
make decisions as to what the next step is.
And a good example of teleporting that we use is, if you Google something and you tell
somebody: "Well, Google my name and it's the second link that shows up on the page." So
you're gonna make a decision of clicking on the second link, but you know ahead of time
it's the second link before you even see what the second link is. So that is sort of a teleporting
type behavior that you know how to get there - yes there are actions - it doesn't mean
you type the URL that you get directly to the end - there are actions that you go on
the way there.
And we tie that to this notion of information addressability. We say that information has
an address - meaning that you have a way to specify how to get to the information. Now
that doesn't mean, again, it's not that I know the URL, you might know the address of
how to get to - for example, Manos' phone number, by saying Google his name; his webpage
will come up; click there; click on personal; and his phone number is there. That is a full
address to the information because it unequivocally gets you to the information you want. It doesn't
mean that you don't have to do some steps to get there - but it does mean that you don't
have to think about it. It's almost just sort of mechanic to get to it.
The way we think of that is that sometimes you have partial information and that's where
search engines come in. A search engine is a tool that you give it partial information
and it gives you some more information that you make decisions and then navigate to them.
So we think of partial information and searching as going together. Searching tends to be used
with partial information. If you had the full information, you wouldn't need a search
engine to get to the information you want.
[pause]
The other thing that has been studied is this idea of strategies that work over different
collections. So finding a file and finding email is different. Even though you could
use search for both - largely because we remember different things.
There are studies that have shown that people remember the sender and the recipient of emails
very well and that becomes a very prominent factor to use for finding things.
For files, a lot of times file name is not the most discriminating factor and a lot of
people don't remember the file name. A lot of people remember content or remember, you
know, "it's a file that I was working on last week, I don't remember what I named it -
Proposal I, or something like that."
For email that's not the case. Email you know that it's an email that somebody sent to me.
So you can query on that and then just sort of browse until you find it.
So the behavior across different collections is different, which makes this finding information
sort of interesting because particular strategies that work on one don't work on the other.
So once you have information, you have to make a decision: "Do I keep this information?
Do I just somehow accumulate it in some form? Do I bookmark it? Do I copy, do I print it
to PDF and save it on my hard drive? What do I do with the information?" And the information
- the decision is largely based on: "Is there a value of keeping this information for future
use?"
A lot of times the information is out there - you can just look for it again - and so
there's no need to keep it. A lot of times the information is sort of temporary - like
today's temperature. Well, you don't want to keep that because tomorrow it's irrelevant.
But overall you end up with a lot of information that you have to make a decision that you
want to keep or not and that starts to build up.
William Jones has made this nice diagram that says that if the information is useful
and you make a decision to keep it, you've made a successful decision. If it is useful
and you don't keep it, then you have a miss, because later when you have to use that information
you have to go looking for it again. If the information is junk, and you decide to keep
it, you have a false positive and it's gonna affect finding information in the future,
because now it's gonna be among the things that you kept that you don't need. If the
information is junk and you don't keep it, you made a correct rejection.
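As a rough illustration of the four outcomes just described (this is not something shown in the talk), the keeping decision can be sketched as a small Python function. The function name and the outcome labels are assumptions chosen for this sketch; only the four cells themselves come from the diagram as described.

```python
def keeping_outcome(useful: bool, kept: bool) -> str:
    """Classify a keeping decision into one of the four cells of the diagram.

    `useful` is whether the information later turns out to be needed;
    `kept` is whether the person decided to keep it.
    """
    if useful and kept:
        return "successful keep"
    if useful and not kept:
        # Useful but not kept: it has to be found again later.
        return "miss"
    if not useful and kept:
        # Junk that was kept: it clutters future refinding.
        return "false positive"
    return "correct rejection"


if __name__ == "__main__":
    for useful in (True, False):
        for kept in (True, False):
            print(useful, kept, keeping_outcome(useful, kept))
```

Note that `useful` is only known in hindsight, which is exactly why the decision is hard at the time it is made.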
And this is the issue, here, that you have to keep in mind when you're doing this and
when you're researching it that keeping information is not costly, but it affects the cost of
refinding down the road.
On the other hand, a lot of times you have information that you don't know if it's useful
- it's what's called post value recall - it's information that at the time you saw
it, who cared. And then later you go, "You know" - like me going to San Francisco coming
here for a conference - well a month ago I might have seen some review of a restaurant
in San Francisco, and I wasn't thinking of San Francisco so I didn't care. Now I'm in
San Francisco and realize: "Oh you know, I saw a review - a good place - and where did
I see that?" I didn't keep it - now it has value - value that I didn't know at the time
I made the decision, so it's impossible to manage and keep that information.
So the keeping decision is not completely a local decision. Sometimes it's a sort of
future decision that you have to guess: "Is this gonna be useful in the future?" There
are people who think that you should keep everything. Well just keep the cache big enough
- storage is cheap - just let a search engine find it. And that could be a possibility and
there are people - and I'll get back to that later in another slide.
So, but it is an issue for decision - it is an issue for research.
Organization is difficult for a couple of reasons: one, it is actually cognitively demanding
to decide where you file something. We tend to think - we've been trained to think in
categories that tend to be exclusive - so things live in only one place.
When it comes to digital information, there is no reason why that's the case. Furthermore,
it takes more effort for you to think: "Okay. Does this belong under my professional presentations
or it belongs under my trip to San Francisco?" And the answer is probably, both.
You don't really have to make that decision on one or the other.
File systems force you to make that decision so the demand, the task is a little harder
and then it becomes also an issue of: "How am I gonna reuse this information?" You need
to get back to that, for example, filing the slides from a presentation, I put
them under a folder called Presentation under a travel folder for San Francisco. And I'm
actually in San Francisco for another purpose so it's under a folder of Grad School Presentation
at something or other. It made sense at the time of organizing this, because of the time
factor. I bet you six months down the road, I'm gonna go looking for the slides, and I'm
gonna go crazy and not find them. Because I'll probably go looking for them in a folder
of presentations given at, or something like that. And that's not where it is.
So, and you don't want to spend way too much time linking all these things and connecting
them. You almost want to just dump them someplace and just find it somehow.
And actually there are people like Manos that name files by key words he will use for searching.
The file names are like this long, separated by comma or spaces, and they're almost irrelevant,
but there are presentation, Google, grad school, council, conference, San Francisco, and it's
just a bunch of words that are all the possible ways of how you're gonna go back to it. And that
might seem weird, but it seems perfect when you think about refinding that file later.
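A minimal sketch of why keyword-style file names work well for refinding, assuming names are just lists of words separated by commas or spaces as described above. The helper names and the sample file names are hypothetical, made up for this illustration; this is not a tool mentioned in the talk.

```python
import re


def filename_keywords(name: str) -> set[str]:
    """Split a keyword-style file name into a set of lowercase words."""
    # Drop the extension, if there is one.
    stem = name.rsplit(".", 1)[0] if "." in name else name
    return {w.lower() for w in re.split(r"[,\s]+", stem) if w}


def find_files(names: list[str], query: str) -> list[str]:
    """Return the names whose keywords include every word of the query."""
    wanted = {w.lower() for w in query.split()}
    return [n for n in names if wanted <= filename_keywords(n)]


if __name__ == "__main__":
    files = [
        "presentation, Google, grad school, council, conference, San Francisco.pdf",
        "budget, 2009, travel.xls",
    ]
    print(find_files(files, "google conference"))
```

Any of the words remembered later (the place, the audience, the occasion) is enough to get back to the file, which is the point of putting them all in the name.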
The other problem that happens is fragmentation. You end up with things in multiple places
and you look in the other place or you look in one place and it's in the other one.
There are a lot of strategies, and I'll come back to those in a second, that have been
identified for files and a lot of them for email. Filers vs. pilers.
Filers are people that methodically put everything in a folder someplace. Pilers are people
that just leave a bunch of icons on the desktop. It's the one whose desktop looks like it's
overflowing.
You have spring cleaners, which are people that sort of pile for awhile and then they
sit there and file everything in one day. It's like a spring cleaning of your house.
There are people that do filing; there are people that do tagging if the system supports
it. And one of the benefits of this structural organization - 'cause if you think about it
- filing structurally everything I said is negative. Well, there is a benefit to it.
The benefit to it is that as you navigate your file structure to get to a file, you're
rehearsing the organization and it becomes more in your head and then it becomes easier
to say: "Oh. That goes tin, tin, tin, tin. And that's stored in this that's down the
road."
So there is a benefit to a structured organization, because it helps you rehearse your own organization
mechanism.
So here's what I was saying. There are different strategies on how you organize things. There
is no right strategy. There is no good strategy. It all depends on a person by person. You
have the file and organize everything. You have the organize nothing and just search
for it.
And I was more of the file and organize everything until I had this very weird experience. I,
I was looking for a paper, and I literally Googled the title of the paper. The first
hit that came back was my own server. I basically had accumulated a bunch of PDF's of some research
paper in a folder on my server. The folder was visible publicly; the Google engine had
crawled it; and here comes the link perez.cs.vt.edu. I'm thinking: "Oh, don't I feel like a moron?
I am accumulating things and organizing it and then don't know where it is and Google
for it and find it in my own space." And it was interesting 'cause it made it very
clear to me that maybe I shouldn't organize anything. I mean if I can put things in a
place where a good search engine can pick them up, who cares where it is?
I mean, you know, that was a very extreme example: from Virginia I Googled it
and it came back to my server that was sitting like next to my leg. And the funny thing is
that I had no idea it was there. So it sort of changed a lot of my information practices
to realize: "Okay. So the search engines are getting to a point where they are good enough that they
can sort of help me find stuff to the point that I don't actually have to organize anything;
or at least only have to organize loosely."
There are different groups that have explored this. The MyLifeBits group at Microsoft believes
that you should save everything. Literally, everything. They actually have, like video
cameras that tape your whole life and archive 24 hours of video of your life. They actually
have audio recording of your phone conversations and like archive them and then they - the
plan is to eventually transcribe them and search them and all that.
I think that's a little bit extreme. There are groups that believe you have to have everything
structured and they are very formal in terms of taxonomies and ontologies that seem
more for a librarian than for a normal person.
There are people who say: "You need to do a lot of unification so that Manos on my email
is connected to Manos the co-author of a paper in my file system, and we know they're both
the same person," and so forth.
And there are people that just say: "You know, just dump them someplace. Who cares? Just
search. The search engine will pick it up and will find it for you."
The tagging and search is sort of a loose organization, in the sense that you're
just putting some label and dumping it in a big bucket and then just letting search
find it.
Email organization is very - it's interestingly, you know, it's similar but different in the
sense that we have until fairly recently, we had lots of really bad email tools. IMAP
I think solved a lot of the problems with email. So the most common problem you hear
with email is: "Oh, that's on my home computer." And that only happened because you had a POP client
that downloaded it and now you can't access it somewhere else. So that changed a lot of
people's practice once they moved to IMAP. Now all your email is everywhere. Now you
can put effort into filing it because you're filing it in a place that you'll find it later
on another computer.
Tags with Gmail and some of the other systems that support tags also help with that.
So one of the things that people do in email is this idea of prioritizers and archivers.
People mark the emails or label them as: "I need to work on this" and then archive the
rest. And that stays in your inbox or stays in some sort of inbox.
I used to be a heavy filer on email and a number of experiences convinced me to ignore
that. I'm sort of a prioritizer and archiver now. All my emails are archived in one folder
called "Archive." I don't care who they are, where they come from. I don't do anything
to them. Don't color them, tag them or anything. They stay in the Inbox until I handle them.
Once I've handled them - archive. Out of the way.
And the search tools on the email programs are good enough that you can just go find
them without any problem. So you save a lot on the organization at the cost of every now
and then an email being a little hard to find.
So now I'm gonna tell you a couple of studies that we've done at Virginia Tech over the
last few years.
The first one is the Refinding Study done by Rob Capra. We're interested - we were interested
in, in the idea of refinding. The idea of finding information you've already found.
And finding information particularly on the Web.
He got into this project through a very sort of wacky and innovative system. He built this
thing called Web Context, and it was a caching mechanism for a Web browsing experience that
would then pull out key words out of all the web pages you've seen, and builds a phone
interface for you to query pages and pull out addresses and phones. And the idea was
the example I gave earlier of a restaurant in San Francisco, because he often had the
experience of: "I'm going to a conference." You make a reservation and in the process of making
the reservation, you find the area for the hotel; you see recommended restaurants; you see all
this information that at that point is irrelevant. Six months later when you're in that city,
it becomes relevant. But when you're in that city your computer is somewhere else
or you don't have Internet.
So what he would do, he would pull all the addresses and phones found on Web history
pages, and then he built a phone interface that used words and you would say: "Chinese
restaurant, San Francisco. I want the phone." And the system would search for the pages
and actually would read out to you the phone number for any Chinese restaurants that you
had browsed for.
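The core idea described here (pull phone numbers out of pages you have already visited, then look them up by key words) can be sketched in a few lines of Python. This is only an illustrative sketch, not the Web Context system itself: the in-memory `pages` dictionary, the simple US-style phone pattern, the function name, and the sample data are all assumptions, and the voice interface is left out entirely.

```python
import re

# Assumed pattern: simple US-style numbers such as (415) 555-0123 or 415-555-0123.
PHONE = re.compile(r"\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4}")


def phones_for(pages: dict[str, str], query: str) -> list[str]:
    """Return phone numbers from cached pages that mention every query word.

    `pages` maps a URL to the plain text of a page the user has already seen.
    """
    words = [w.lower() for w in query.split()]
    found = []
    for text in pages.values():
        lowered = text.lower()
        if all(w in lowered for w in words):
            found.extend(PHONE.findall(text))
    return found


if __name__ == "__main__":
    history = {
        "http://example.com/wok": "Golden Wok, Chinese restaurant in San Francisco. Call (415) 555-0123.",
        "http://example.com/pizza": "Pizza place in Blacksburg, 540-555-0199.",
    }
    print(phones_for(history, "Chinese restaurant San Francisco"))
```

Because the lookup runs only over pages already browsed, it answers the refinding question ("where did I see that?") rather than a general Web search.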
So it was a very interesting study. It was a very cool system. Voice XML sort of got
in the way in a lot of things we were trying to do. But then he got interested in refinding
information and this whole idea of: "How do I go and find something that I already had
seen?" And there was this question of: "Is refinding different than finding?" And intuitively
if you think about it, finding is very exploratory. You are looking for something that: "I'm sure
it's out there someplace, but I've never seen it. I don't know where it is. I don't know where
it would be."
Refinding is this idea that there is something that you already saw, that you're certain
is there. You may remember the color of the page; you may remember it was in some government
website; you remember some things. The differences are enough for you to sort of have a different
search behavior.
So we conducted a study. We had 18 different tasks. We had two sessions. We invited people;
asked them to search for information. We invited them back, I think two or three weeks apart,
and we basically asked them to refind same or similar information. Right before each
task, we asked them: "How familiar are you with this task? And how frequently do you do
this task?" And what we found was that familiarity and frequency turn out to be the most discriminating
factor in whether you can refind the information easily or not. Meaning, if you do the task
very frequently, then you can get back to the information faster.
The types of tasks that we did were: looking up a word in a dictionary; looking up a phone;
making a travel arrangement, meaning find a flight - the price of a flight
from one city to another; finding instructions for a DVR, you know, go to TIVO
and download the PDF for this particular model; sports score; movie times; stocks for a company;
the home page for a professor at a university; the weather for a city; and information for
a local restaurant, like what time does it open and things like that.
Interestingly enough, and contrary to what we expected, search engine did not, or I'm
sorry, search engine use did not come out to be a significant factor between finding
and refinding. People just use the search engine the same way they use the search engine.
If they had a problem finding the information at the beginning, they had a problem finding
the information when they came back. If it was easy to find it, then it was easy to find
it on the second task.
Task type had an effect - as you can imagine - finding out the definition of a word versus
finding out the weather in a city are very different tasks and had an impact.
There were a number of tasks that were very hard to do. Well, there were two in particular:
one task was find the headlines - today's headline. Two weeks later we say "Find
the headlines from two weeks ago" and the sort of interesting thing we realized is that if you
go to any news site, the full default URL, CNN.com or ABCnews.com, shows today's news.
Well, yesterday's news are someplace in there, but they're not under today's back one day.
They're in some non-descript URL and often it's hard even to find them by date. It's
more by topic that you find them.
So it was really hard for the person to go back to last Monday's news. They didn't remember
what the news was; they remembered they had seen it on Monday and that became a very,
very difficult task.
The other one that was difficult and was sort of interesting was, we asked people: "Pick
two sweaters you want to buy for a friend for a gift." We didn't give any indications
as to where or what type of sweater or anything like that; [pause] and on the refinding activity
we asked them to find one of those two sweaters. And what happened was, what we noticed was,
that a lot of the decisions were sort of the orienteering type of decisions. Like you see
two sweaters you go: "I like that one better." And you go on and it's a very local decision
that has no principle behind it. So when they came back they went like: "I don't remember
which one of these two I picked." So it became very difficult to actually follow the same
path to the information that they had found before.
I mentioned information addressability; it's the same topic here.
We came up with this model. We came up with a model that says low frequency and low familiarity
refinding is really no different than finding. If it's something that, yeah, you saw it before,
but you've seen it once - you don't do it frequently - it's almost like just finding
something. And the behavior and the patterns and the search queries and all that are very
similar.
The moment you go up in familiarity, meaning that you visited that site several times,
then you start learning some patterns. You start remembering middle of the road points.
So if it's a page for a professor at Georgia Tech and you've gone there multiple times,
maybe you remember the URL of the department and you start there, as opposed to start at
the Georgia Tech page and then the Department and then the professor.
Once you cross over in frequency - familiarity and frequency mean short cuts. People that
do a task a lot of times will remember shortcuts, either by bookmarking it or by just
typing the URL. So a lot of people found weather. They went straight to the weather site. They
knew which one it was - they check weather all the time. So they went straight to the
weather site and just entered a zip code or the name of the city.
So with familiarity and frequency you're now in the domain of learned shortcuts
or actually accumulated bookmarks or, you know, buttons on your browsers.
Then there are these strange, inherently difficult tasks. These are tasks that you do with
high frequency, but no matter what they're really hard. Going to yesterday's news is
really tough. And it's not so much anything in the task or the users, I think it's more
on the Web design. There are just some tasks that are hard to do; it doesn't matter
how many times you do them, they're just gonna be down in that other category. And that category,
I think, is very small. There are just very sort of odd things in there.
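The model just described can be summarized as a small lookup on familiarity and frequency. The sketch below is one hedged reading of it in Python: the boolean inputs, the function name, and the labels are assumptions, and treating "high frequency without gained familiarity" as the inherently difficult category is an interpretation of the description above, not a definition taken from the study.

```python
def refinding_mode(familiar: bool, frequent: bool) -> str:
    """Map familiarity and frequency to the refinding behavior described in the talk."""
    if familiar and frequent:
        # Bookmarks, typed URLs, going straight to the known site.
        return "learned shortcuts"
    if familiar:
        # Remembered mid-way points, e.g. starting at a department page.
        return "remembered waypoints"
    if frequent:
        # Assumed mapping: done often, yet never gets easier.
        return "inherently difficult"
    # Seen once, rarely done: behaves like finding for the first time.
    return "like first-time finding"


if __name__ == "__main__":
    for familiar in (False, True):
        for frequent in (False, True):
            print(familiar, frequent, refinding_mode(familiar, frequent))
```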
[pause]
Oh, one thing that we found that I think is worth pointing out - our study had just two
sessions. You come in and then you come back a few weeks later. We found that they went
from quadrant one to two after just one session. So the familiarity just having found something
once is enough to influence your access pattern and make it easier to find. You do less exploratory,
you know, the hub-and-spoke model where you find one place and you go to all the result links
and then you find another place. We found that just having done it once was enough for
them to remember two weeks later on how to get to the information.
[pause]
The next study I wanted to mention is calendar use. This was a study that was initially done
by an undergraduate student and then Manos and I picked it up. It had been sort
of gathering dust in my office for a while, and we sort of re- sort of cleaned it up
- did a little bit more data analysis; and there are a couple of things that are interesting
and we're still trying to get this published.
What we wanted to do was try to find out how people search or how people use calendars,
given that today's environment has changed so much from the time when calendar management
had been studied. Calendar management was studied in the '90s by several people
and this was the days when you had big workstations in your office and you didn't take
them home; and you didn't access them from home either. So using a calendar meant running
a program on your Sun workstation or something like that.
Since then we've had personal computers, PDAs, now phones, now Web applications - so we
wanted to sort of see how things have changed. And this was done at a university - ninety-eight
survey participants - most of them faculty or more than half faculty. And then we did
16 in-depth interviews with them.
Here's some interesting findings. We found a lot of proxies for calendars. We found a
lot of people that print the calendar. We found a lot of people that wrote down notes
for the calendar and took it with them. We even found users that did that in spite of
having a Palm - this was a few years back - in spite of having a Palm device or
some kind of handheld device.
We also found that calendars continue to be printed. That no matter what all the technology
gives you, that people have certain comfort on the paper calendar, even if it is a printout
of the computer calendar.
And to this day, my supervisor at the graduate school actually prints her calendar every
morning. You know, on a color printer; looks nice - all the color and everything - but
still prints it every morning - doesn't go back to the computer to see what my nine o'clock
is - the paper sort of is the one that's more current.
We found that a lot of people keep family and work calendars separate. And particularly
family calendar tends to be a wall calendar posted up in the kitchen or something like
that. It was amazing how many people pointed that out - it wasn't just one or two.
The paper calendar continued to be of use for things that actually technology is not
helping and I'll go to that in a few seconds.
Paper trail - it's one of them. Annotations is another one. And this notion of opportunistic
rehearsal is another one.
Opportunistic rehearsal is when you look ahead in your calendar, and the more you look ahead
the more you remember what you have tomorrow at two o'clock. And a lot of the on-line calendars
don't have a good way for you to look ahead sort of casually without wanting to.
And actually Google calendar has one interesting feature that is the next four days. No matter
where you are in the week, they show you the next four days. That's a perfect example of
good opportunistic rehearsal - that a lot of calendars show you day or week. On the
week calendar if I'm on Saturday, I can't see next week. And if I go to next week, I
can't see today. It's sort of this artificial boundary of where the week ends. When reality
- today's today and I want to know what's happening tomorrow - and I don't care if it's
next week - it's tomorrow.
So this opportunistic rehearsal is one part that a lot of calendars don't
really take advantage of and make it difficult. So people prefer to print it by hand.
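The week-boundary problem described here can be illustrated with a small sketch. This is only an illustration, not how any particular calendar product is implemented; the function names `week_view` and `rolling_view`, the Sunday-to-Saturday week, and the sample date are all assumptions made for the example.

```python
from datetime import date, timedelta

def week_view(today: date) -> list:
    """Days of the Sunday-to-Saturday week that contains `today`."""
    # date.weekday() gives Monday=0 ... Sunday=6; shift so Sunday starts the week.
    start = today - timedelta(days=(today.weekday() + 1) % 7)
    return [start + timedelta(days=i) for i in range(7)]

def rolling_view(today: date, days: int = 4) -> list:
    """`days` consecutive days starting at `today`, ignoring week boundaries."""
    return [today + timedelta(days=i) for i in range(days)]

saturday = date(2009, 3, 7)            # an arbitrary Saturday
tomorrow = saturday + timedelta(days=1)

print(tomorrow in week_view(saturday))     # False: tomorrow falls in "next week"
print(tomorrow in rolling_view(saturday))  # True: the rolling window always includes it
```

With the fixed week view, tomorrow disappears on the last day of the week; with the rolling window it is always visible, which is what makes the casual look-ahead possible.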
The other one is one that I actually, I do this a lot, is people use a calendar as a
memory aid for reporting purposes. You go back when you have to do your annual report
- you go back and see where did I go this year? What did I do this year? What conferences
did I attend? So you use it as a record of things you've done and as a memory of who
did I meet with, and when did I meet with such and such person, and things of that sort.
And again, that's another issue that, that I'm not sure on-line calendars do anything
in particular to help, other than the fact that it's still there. There is no sort of
global way of browsing old history which clearly is different than, than you know, the day
to day or month to month browsing of the upcoming events.
We collected calendar samples and I want to show a couple of examples here that are sort
of interesting. The one on the top left - this person had an activity and printed the
calendar and then proceeded to put a Post-It on top of it to capture to write - in this
case I think it was a potluck dinner - and the person wrote what she was bringing to
the potluck dinner.
So these are the type of things that paper calendars support, that on-line calendars
make it really difficult to do. It's the idea of adding some sort of generic note - not
one that is hidden inside someplace - but just something on top.
The middle one is similar. That was October 31st and the person wrote, you know, a pumpkin
to remind herself that she had to take the kids to go trick-or-treating.
The one on the bottom is interesting. This was a grad student who parked at night in
a place that he couldn't park during the day. So he put an alarm to move the car at eight
o'clock so it wouldn't get towed. So he was so confident that he was gonna be in the lab
at eight in the morning from an overnight working on a project, that he just had a recurring
alarm for every day "move car at 7:30." And again, it's sort of a weird, odd use of a
calendar. You think of calendar for meetings, maybe reminders - this is sort of a very strange
- off the mainstream.
So what implications for design did we come up with from that? Paper trail. We had people that
wanted to know when they deleted an event. Okay? And there are two cases for that.
One is, I look at my calendar for Monday; I see I have something at ten; okay. Monday
morning I go there and it's not there. Now I got to wonder what happened. And my administrative
assistant has access to my calendar. So she could have deleted it. But the question is:
"Did she delete it? Did she move it somewhere else? Did I look at the wrong week? Am I looking
at the wrong week?"
Paper calendars if you erase that you will see the mark of the erasure and it gives you
a confidence that: "Yeah. That was erased. It's not there anymore, but it was there."
And that is something that gets lost in the digital is this evidence of use and wear and
tear that disappears.
Similar to that is this idea of tentative event scheduling. Very often you are told:
"Well, separate Tuesday at two, Wednesday at three, and Thursday at eight for a meeting.
It will be one of those three. Let me coordinate with everybody." Well you might mark all three,
but then you delete two of them or you delete only one and forget to delete the other, and
then you show up at the meeting the wrong date and realize: "Oh, I forgot to delete
that. That was a temporary one."
And what you have is you have multiple appointments that are almost like an exclusive or - you
really want only one of these - and at some point you want to confirm: "This is the one
and the other one should disappear." And when you go electronic, that option is not there,
but it's a common practice that is unsupported.
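One way to picture the "exclusive or" behavior is a group of candidate slots where confirming one releases the rest. The sketch below is a hypothetical data structure, not a feature of any existing calendar; the class name `TentativeGroup` and its fields are invented for illustration. Keeping the released slots in a list also leaves a small paper trail of what was removed.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TentativeGroup:
    """Candidate slots for one meeting; only one is meant to survive."""
    title: str
    slots: List[str]
    confirmed: Optional[str] = None
    released: List[str] = field(default_factory=list)  # record of removed slots

    def confirm(self, slot: str) -> None:
        """Keep `slot` and release every other candidate in the group."""
        if slot not in self.slots:
            raise ValueError(f"{slot!r} is not one of the candidate slots")
        self.confirmed = slot
        self.released = [s for s in self.slots if s != slot]
        self.slots = [slot]

meeting = TentativeGroup("Project meeting", ["Tue 2pm", "Wed 3pm", "Thu 8am"])
meeting.confirm("Wed 3pm")
print(meeting.slots)     # ['Wed 3pm']
print(meeting.released)  # ['Tue 2pm', 'Thu 8am']
```

Because the candidates are tied together, there is no way to delete one and forget the others.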
And needless to say, intelligent alarms - how many of you have calendars that go off
on your phone, your laptop, your desktop, at home all at the same time? They're all
beeping and going crazy. I've stopped using alarms because of that. Because they all beep,
and actually they all beep at about the same offset from each other – it's like a minute
off - so my iPod will go off first, then my cell phone goes off, then my laptop. So I've
just stopped having alarms go off. Which then you lose the alarm. You lose the feature of
the alarm going off and alarming you because you get annoyed half the time because it's
going around.
So moving to sort of more of the future work that we're doing. So knowledge management
in organizations is this idea that there is knowledge in the organization that is sort
of not directly embedded in the mission statement or in the financial report. It's in the people
that work in the organization and they - a lot of organizations struggle to try to
capture this knowledge. You want to have that information so that if one employee goes,
the next one that comes in can pick it up sort of quickly from where they were.
What we're looking at is, we're looking at this idea that knowledge management can be
a byproduct of PIM practices. Knowledge management can be, in an organization, can be a sort
of side effect of how you do your email, how you do your filing, how you do your other
things.
And we're trying to find ways that that can lead to social collaboration, to identifying
experts in the organization by just looking at traffic or where the questions go. And
to help shape the awareness of audiences. A lot of research on communicating on listservs
says that people don't participate because: "I don't know who's receiving the
email. So I'm afraid to say something that might offend somebody." And that's part of
this issue of organizational management.
So the big research question is: how do my personal practices benefit others in the organization?
[pause]
And that is - I'm sort of trying to stay away from the big social part. I'm trying to focus
on the organization because there is a lot of things in common between you and the person
that sits next to you. Whereas, there isn't that much in common between you and your best
friend, because your best friend might work in a completely other industry and the types
of emails or documents they manage is just completely different.
So then finding information is no different than personal information finding, but in
the organization you have these other situations where people find information for you. You
get emails from people saying: "Oh, I found this and I thought of you. Here." Boom. And
they push it onto your face. Or they ask you questions about things you've found that now
they want. "You know, at the meeting you mentioned this last year's report. Where is it?" So
now you don't only have to find it, but you have to find it and make it available to others.
So I think we have more ways of finding information in our organization because now part of it
is this sort of collaborative information finding. We also have different
ways of organizing information and this is where the personal starts moving to the centralized.
At the graduate school at Virginia Tech, we have a central file share and I just absolutely
hate it. I don't use it because it's the union of everybody's file systems. So where my things
reside is like buried someplace in multiple areas and things of that sort.
So there is a folder for public presentations and then there's different people inside and
then they're inside. And it is sort of strange to me it's very unnatural to use that and
still keep it in sync with my file system. They don't match structure wise. Because I'm
in CS, I can refuse to use it and explain technically why I don't like it. The rest
of the people don't have that luxury. You have to put it over there. It's not there,
you have to put it over there.
And it becomes really weird. It becomes very strange. It's almost like - it's a problem
looking for a solution. [coughs] It's also strictly hierarchical. You have to make a
decision: "Okay, I have a form. This is an admissions form, it's also a form, it's also
this, it's also the other - we have versions problems - okay, this is an update - oh, I
didn't put a number, etc, etc." So it gets very frustrating and it has a very high cognitive
demand, very quickly.
The other part that we're looking at, is how to leverage this social graph of people. Imagine
all the people you're connected to within the organization. I don't want to - again,
I'm trying to stay away from the Facebooks of the world.
Implicitly you can use a social graph within the organization to improve your PIM practices.
For example, email providers use an aggregate view of email traffic across users as a
way to identify spam. If a particular email has a few key words, and it went to everybody
on site, you have a very high certainty that it's spam, and then you can now provide spam
filters to individual users that are heavily influenced by the collective users.
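As a rough sketch of that aggregate signal, one could count how many users received the same message and flag the ones that reached nearly everyone. This is only a toy illustration: the function name `widely_sent`, the 0.8 threshold, and the sample inboxes are assumptions, and real spam filters combine many more signals than this.

```python
from collections import defaultdict

def widely_sent(inboxes: dict, threshold: float = 0.8) -> set:
    """Return messages received by at least `threshold` of all users."""
    recipients = defaultdict(set)
    for user, messages in inboxes.items():
        for message in messages:
            recipients[message].add(user)
    cutoff = threshold * len(inboxes)
    return {m for m, users in recipients.items() if len(users) >= cutoff}

# Hypothetical inboxes: user -> subjects received.
inboxes = {
    "ana": ["cheap pills", "budget draft"],
    "ben": ["cheap pills", "lunch?"],
    "cy": ["cheap pills"],
}
print(widely_sent(inboxes))  # {'cheap pills'}
```

No single inbox reveals much, but the view across all users does, and the result can then feed each individual's filter.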
So we can think of the same thing in terms of organizing your email. In terms of organizing
your files. Where do you file that attachment that I send you might be relevant to where
I want to find it.
Explicitly, on the other hand, if we could capture the strategies, we can build a community
of sorts. We can say this is how person x handles their email and you can sort of -
it's almost like a skin that you would apply to your email program that would create some
smart folders; create a set of tags; and it would create some instructions that says:
"When you get this email do this and put it over here." And then you can share those practices
through the organization. You can do it automatically because the tool does it automatically, like
spam does. Or you can do it manually by allowing sort of a community of practice to build some
of the strategies and accumulate them someplace and you download them when applying to your
Now on this idea of your social graph, imagine if we could capture the friends of a friend
network from work, from your neighborhood, from your Facebook site, from you know, your
high school friends, your personal, social life and all that, and you unify around people
and then apply something like the PageRank algorithm, but a friend rank so that people
that are of higher importance show up higher on that graph. Then you can use that as a
way to capture whose email is more important than others.
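To make the "friend rank" idea concrete, here is a minimal PageRank-style power iteration over a tiny graph of who writes to whom. It is a sketch under stated assumptions, not the design being described in the talk: the function name `friend_rank`, the damping factor of 0.85, the fixed iteration count, and the sample graph are all chosen for illustration.

```python
def friend_rank(links: dict, damping: float = 0.85, iterations: int = 50) -> dict:
    """PageRank-style score per person; `links` maps a person to the people they write to."""
    people = sorted(set(links) | {p for targets in links.values() for p in targets})
    n = len(people)
    rank = {p: 1.0 / n for p in people}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in people}
        for person in people:
            targets = links.get(person, [])
            if targets:
                share = damping * rank[person] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Someone with no outgoing links spreads their score evenly.
                share = damping * rank[person] / n
                for target in people:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: an edge means "sends email to".
links = {"me": ["manos"], "ana": ["manos", "me"], "ben": ["manos"], "manos": ["me"]}
ranks = friend_rank(links)
for person in sorted(ranks, key=ranks.get, reverse=True):
    print(person, round(ranks[person], 3))
```

People who receive mail from many others, or from highly ranked others, end up near the top, and that ordering could then be used to rank incoming email.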
If we could do this and save it and manipulate it, then you can temporally update it. For
example, this week emails from Manos to me are of high priority because he was gonna
pick me up this morning. I didn't want to miss that email in the middle of 20 other
emails.
Last week I was working on a proposal with three professors - emails from those three
should go above the top of the queue. They are the most important one. So if you can
manipulate the importance of people that can influence your PIM practice when you're working,
implicitly you can use it to rank emails and so forth.
So the idea - the research question is: "Can we capture and share the strategies for use
on email?" And that's sort of something we're working on. I'll give you one quick example
of that.
We are looking at sort of applying email tags across the email client. So if I send somebody
an email and I tag it, the software on the other end says: "The sender had this tag.
Would you like to use it on the email too?" It turns out that the most appropriate person
to tag an email is the sender. The receiver has to read the email before deciding what
tags to use.
Now you can send the tags along without revealing anything private to the other person. You
can anonymize them. Tag I, Tag II, Tag III. And you can also then share them back and
forth and you can have this notion of sending emails and then sending back as suggestions
of how to organize it. And when you put together the collaborators across, you get a richer
set of tags that if you sort of do unification much like site you like or Delicious does
and say: "Yo. People that like that bookmark also like this other one." Without understanding
what they are, just because there is a common link between them; then you can have people
that tag this file with this tag also - or email - also tag this email with the same
tag.
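A small sketch of how anonymized sender tags could still drive suggestions: the receiver never learns what the sender's tag means, only which messages share it, and gets back their own tags from those messages. Everything here is hypothetical - the function name `suggest_tags`, the opaque ids like "T1", and the sample data are made up for illustration and are not the system being built.

```python
from collections import Counter

def suggest_tags(new_message: str, sender_tags: dict, my_tags: dict) -> list:
    """Suggest my own tags for `new_message` from the sender's opaque tags.

    sender_tags: message id -> set of opaque tag ids applied by the sender
    my_tags:     message id -> set of tag names I applied myself
    """
    incoming = sender_tags.get(new_message, set())
    votes = Counter()
    for message, opaque in sender_tags.items():
        # A shared opaque tag is the "common link" between two messages.
        if message != new_message and opaque & incoming:
            votes.update(my_tags.get(message, set()))
    return [tag for tag, _ in votes.most_common()]

sender_tags = {"m1": {"T1"}, "m2": {"T1"}, "m3": {"T2"}, "m4": {"T1"}}
my_tags = {"m1": {"proposal"}, "m2": {"proposal", "urgent"}, "m3": {"travel"}}
print(suggest_tags("m4", sender_tags, my_tags))  # ['proposal', 'urgent']
```

The heavy organizer's grouping does the work, and the light organizer only has to accept or reject a suggestion.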
So you can do this collaborative tagging in an organization and leverage the heavy organizer
in the group to help in information organization to the non-heavy organizer in the group. And
we're currently building this. We have a design - an architectural design to actually work
with Gmail, and hopefully over the next few months we're gonna take a crack
at it and see how it goes.
My sort of conclusions - I have another slide at the end that I want to touch quickly.
Information overload is here and it is killing us. We sort of need to find a way to do things.
I think Bill Gates has been quoted as saying that he's become a librarian and he hates
it. So he wants people to solve this problem. He doesn't want to spend so much time organizing
his information. He wants to do work.
I think we're way beyond the point of - we spend more time organizing than we need to.
PIM is sort of studying exactly that problem. How to organize the information; how to make it
useful. We're pushing it now to see if organizations can benefit from that angle, and are we gonna
help people organize information; help people identify experts; and provide some added value
to employees.
Because this talk was co-sponsored with the Hispanic Googlers Network, they asked me to
say something brief about my life history, so I'm gonna give two seconds of this. I think
some of you might find it interesting.
There are many career paths. My high school yearbook says I was gonna be a lawyer.
[laughter]
And I actually went to college to study law. Why? Because my father was a lawyer. My Mom
was a university professor. I was really good at math in high school, and as a mathematician
your choices are limited as to what career you get; at least that's what I was told.
And I sure as hell did not want to become a university professor. That sounded boring.
So math wasn't a choice. My Mom was a professor - I knew that was sort of eh, eh. So it is.
And sure enough I ended up being a university professor, mostly by completely pure chance.
I ran out of the math courses in business. I was being a minor in business; took all
my math courses; ran out of them really quick. My English wasn't that good, so I wanted more
math courses. Those are numbers - I understand those without knowing English.
And my advisor said: "You want to take this four-term FORTRAN programming thing?" I said:
"Sure. Does it have a lot of English?" "No, it's mostly formula." "Ah." Needless to say,
that was it. And halfway through the semester I went back to him and said: "Can I do a major
on that?" He said: "Sure you can. It's called Computer Science." "Really. Oh. Cool, let's
do that."
And I switched to a minor in business and a major in CS and then from there on, you
know, one thing led to the other. I ended up getting a Ph.D. And I'm a boring university
professor now.
So there are many career paths that you follow that are non-traditional, non-standard; none
of them are better than the others. Sometimes we, as minorities, we sort of feel bad because
everybody went to Stanford and have Ph.D.'s from MIT. It's like, no.
The other one that I find a lot of people have a hard time with, is that everything
is relative. I get asked this question: "Do you speak a foreign language?" And I say,
"Yeah. English." And people sort of have a hard time going, "No, no, I mean a foreign
language." I say: "Yeah. English." I did not grow up speaking English. English is a foreign
language to me. English is a second language to me. And it's "No, no, but I mean, you know,
other than English." I said, "Spanish is not a foreign language to me. It is to you, but
not to me." And people have a hard time understanding that.
My wife once was asked: "Have you ever lived abroad?" And she says, "I do now." I mean
abroad means you're living away from your family; you're living in another country.
We grew up in Puerto Rico. This is abroad for us. So it's all relative. People have
a hard time with that here in the United States. I think all of you understand that. In technology
we have a lot of people from international areas.
Find mentors that don't care why they have to give you advice. A lot of times we get
advice from the wrong people. And this one I'm not gonna get into it. We're gonna have
a session in the afternoon at 1:30 to talk about diversity. That's what I do at the grad
school. It's at this room. If anybody's interested in coming and talking, it will be an informal
discussion. We can talk about that.
So I'll stop there and take questions.
[audience talking to one another]
Okay.
No questions? Okay. So we're done.
Thank you.
[techno music playing]