Situating Personal Information Management Practices within an Organization

>>Manos: Hello, everyone. My name is Manos and I'm an engineer on the third QI team. It is my honor to introduce my advisor and associate professor from Virginia Tech, Dr. Manuel Perez-Quinones. Today he'll be talking about his work in personal information management and how users' personal information management practices can be leveraged by the organization for knowledge management as well. So this talk is also part of the perspective as speakers' series which attempts to impart images by sharing their research and their life stories with everyone. All of us. So without much further ado - Dr. Perez-Quinones. Thank you.

>>Dr. Perez-Quinones: And I'm not your advisor anymore. Come on. You graduated. We're colleagues. [laughter] Now having said that, I have on my slides here that Manos Dungare is my Ph.D. student. So I guess I'm in the same boat. So I want to thank Manos for inviting me and I want to thank the Google, the Hispanic Googler Network. I have a hard time saying that. It's a group of Hispanics here that, that saw my talk announced and wanted to co-sponsor it. And one of them is behind this screen here hiding. What I wanted to do is talk a little bit about personal information management research in general. Then talk about a couple of studies. Two particular studies that we've done in research, and my apologies to Manos for not having picked his dissertation as one of the two. And then move on to something that we're pushing the research into - how do we sort of take advantage of the personal behavior and practices as a way to capture knowledge management? And that is work that is evolving now and it's starting and it's a little bit more speculative and open. And then I want to close because the Hispanic group asked me to sort of give a brief life history or something along those lines. Words of wisdom of sorts. So, I'll, at the end, I'll say a couple of things on that. This is the world we live in today. We have just huge amounts of information.
Lots of displays. Lots of digital information that comes to us in many different forms, and we are clearly overloaded. And the one statistic that I saw recently was that in 2005 there were 35 billion corporate emails exchanged per day. And a lot of workers are reporting that they're just feeling overwhelmed by just trying to manage the information. The other thing that is happening is that we have a proliferation of devices. We have handheld computers, phones, PDAs, game boxes and all of them have email, all of them have Web access, all of them are connected, all of them access your calendar - and it's starting to get to be a problem where you - in some cases - you have to wonder: "Where did I put that stuff? I wrote a note to myself on one of my machines someplace." And I think that Cloud and some of the other things that are happening will alleviate that, but the current situation today is that you still have to do a lot of syncing of information. These two things are things that we try to address in personal information management research, and in particular my research group at Virginia Tech has been looking at a lot of the handheld devices and small device work. So PIM research is about studying how people find information, keep it organized and reuse it down the road. It is very, very much targeted to the personal aspect of that. It's very much targeted to "How do I find my emails? How do I organize my emails? How do I organize my files?" Not the traditional IR type research of: "How do I issue a query that will result in the right document?" The research and the work has sort of been organized around this framework of finding, keeping, organizing, and reusing information. So a lot of the studies have been finding studies or studies of making a keeping decision: of the information that I've run across, what do I decide to keep or not? And the reason to study it is the previous two slides. We are overwhelmed. We have a lot of information demands.
Information demands attention, which means you get distracted quickly from one thing to the other. That leads to a decrease in productivity. And we also want to do all this so that we can develop the next generation of technology - one that is better suited to handle some of these issues. PIM is hard to study because of the personal part of it. You cannot do controlled lab studies. For example, if you ask me to do a study on how I handle my email, I can't put you in my inbox. The people that send email to me - you don't recognize; you don't know if they're critical; you don't know if they're urgent. Where I file my emails, you have no idea where they are. So whatever you do in that task is gonna be very, very inefficient and just totally irrelevant. So personal information management has to be studied within the context of your personal information. So if I'm gonna do a study on how you do email I have to somehow: one, get you to sit down and do your email in a place where I can see; two, I have to sort of worry about privacy. I don't wanna see emails that maybe don't belong to a public view of your personal stuff. But at the same time, I have to do that to study it in the right context. So it is very challenging. It tends to be done in this idea of diary studies. We - there are a lot of studies that tend to just follow somebody along. [background noise] What's going on? Okay. Follow somebody along for a week and ask them questions on a daily basis. As opposed to a controlled lab. Now, we can do some controlled labs on some sort of smaller parts and I'll show you one example where we did one of those. The other part of PIM that's sort of interesting and hard to study is that people develop strategies to manage their information. If you look at how you handle your email, it's probably very different than the person sitting next to you. Some of you might use tags on Gmail, some of you might use folders, some of you might use smart folders on the Mail app on the Mac.
Every single person has a different strategy; and it's a strategy that is very heavily dependent on the context of your work. Manos and I have talked about email organization over the last, I don't know, four years and we both sort of have it under control and our strategies are very different. And his works well because of the type of emails he gets. I wish I could do it the way he does it - I can't. And he probably has changed now - we haven't talked. You've probably changed the way you do it now since you're out of school. One of the things we found is that life transitions change your strategies. So the strategies are very, very, very much personal. It's a combination of life, context, person, and the type of tool you use. And, you know, changing from being a student to a professional changes a lot of the types of information you manage. And again, so that's another reason why you have to study it - not only on a very personal basis - but on a personal basis within a particular context of that person's life. A really brief history of PIM research. A lot of the sort of foundational studies on PIM happened in the '90s. PIM as a phrase was mentioned in Lansdale's book in '88, but it really didn't start as a collection of researchers - who identify themselves as PIM researchers - until 2001 or so. That's when William Jones at Washington started doing some research calling it specifically PIM; and then he led a workshop in 2005 which led to a special issue in CACM and led to another workshop; led to a book; and then led to another workshop, and so forth. The last two or three workshops, actually the last four workshops, Virginia Tech has been involved heavily, either me or Rob Capra, who was another Ph.D. student of mine, or Manos Dungare directly involved with the organization; directly involved with presentations and having papers. And there's a plan, there's still a plan for a conference in 2011. So, that's still sort of in the air.
The last workshop was a few months ago in Vancouver and Manos presented remotely one of his dissertation papers. So, I'm gonna go over the finding, keeping, organizing just to give you a flavor of the kind of things that we research in PIM. In Finding Information we study how people search and browse for information as a way to locate information they need to accomplish some particular task. Because PIM is very focused on the personal aspect of it, we focus on finding for personal use or finding in your personal store. We call it the PSI, personal store information. Sort of the information you've collected. You know, the bookmarks, the history cache and all that. It's my personal store and you can search just that by using tools like the ones you get on your desktop. We also find information just by encountering it. And this is one of those things that are difficult. In your day to day browsing you might see something on the side that you find interesting, and you follow that trail and you find some information that is relevant, but you weren't on your way to get that information. And those tend to be more difficult to get back to, because you sort of stumbled onto a trail that led to the information. You weren't explicitly looking for it. We also have this notion - Jaime Teevan described it - of orienteering and teleporting. Orienteering is a search behavior in which you don't exactly know where you are going, but you're browsing pages and making almost local decisions. In every context you look at the page and you go: "It's this choice." And then on the next page you go: "It's this choice." And it's similar to the notion of orienteering. If you are in the woods - where there is no path - at every step you look around - you go: "Let's go that way" - "Let's go that way" - and so forth. And it's sort of counter or juxtaposed to the notion of teleporting, which Jaime described as jumping directly to a goal.
And our work has modified that a little bit with the issue at the bottom, of information addressability. What we describe teleporting as, is knowing exactly where you are going even though you're not jumping directly there. And orienteering is, you don't know where you are going - at every step you have to make decisions as to what the next step is. And a good example of teleporting that we use is, if you Google something and you tell somebody: "Well, Google my name and it's the second link that shows up on the page." So you're gonna make a decision of clicking on the second link, but you know ahead of time it's the second link before you even see what the second link is. So that is sort of a teleporting type behavior, in that you know how to get there - yes there are actions - it doesn't mean you type the URL and you get directly to the end - there are actions that you take on the way there. And we tie that to this notion of information addressability. We say that information has an address - meaning that you have a way to specify how to get to the information. Now that doesn't mean, again, that I know the URL. You might know the address of how to get to - for example, Manos' phone number, by saying: Google his name; his webpage will come up; click there; click on personal; and his phone number is there. That is a full address to the information because it unequivocally gets you to the information you want. It doesn't mean that you don't have to do some steps to get there - but it does mean that you don't have to think about it. It's almost just sort of mechanical to get to it. The way we think of that is that sometimes you have partial information and that's where search engines come in. A search engine is a tool where you give it partial information and it gives you some more information that you make decisions on and then navigate to. So we think of partial information and searching as going together.
Searching tends to be used with partial information. If you had the full information, you wouldn't need a search engine to get to the information you want. [pause] The other thing that has been studied is this idea of strategies that work over different collections. So finding a file and finding an email are different. Even though you could use search for both - largely because we remember different things. There are studies that have shown that people remember the sender and the recipient of emails very well and that becomes a very prominent factor to use for finding things. For files, a lot of times file name is not the most discriminating factor and a lot of people don't remember the file name. A lot of people remember content or remember, you know, "it's a file that I was working on last week, I don't remember what I named it - Proposal I, or something like that." For email that's not the case. With email you know that it's an email that somebody sent to you. So you can query on that and then just sort of browse until you find it. So the behavior across different collections is different, which makes this finding information sort of interesting because particular strategies that work on one don't work on the other. So once you have information, you have to make a decision: "Do I keep this information? Do I just somehow accumulate it in some form? Do I bookmark it? Do I copy, do I print it to PDF and save it on my hard drive? What do I do with the information?" And the decision is largely based on: "Is there a value in keeping this information for future use?" A lot of times the information is out there - you can just look for it again - and so there's no need to keep it. A lot of times the information is sort of temporary - like today's temperature. Well, you don't want to keep that because tomorrow it's irrelevant.
But overall you end up with a lot of information where you have to make a decision whether you want to keep it or not, and that starts to build up. William Jones has made this nice diagram that says that if the information is useful and you make a decision to keep it, you've made a successful decision. If it is useful and you don't keep it, then you have a miss, because later when you have to use that information you have to go looking for it again. If the information is junk, and you decide to keep it, you have a false positive and it's gonna affect finding information in the future, because now it's gonna be among the things that you kept that you don't need. If the information is junk and you don't keep it, you made a correct rejection. And this is the issue, here, that you have to keep in mind when you're doing this and when you're researching it: keeping information is not costly, but it affects the cost of refinding down the road. On the other hand, a lot of times you have information that you don't know if it's useful - it's what's called post-valued recall - it's information that at the time you saw it, who cared. And then later you go, "You know" - like me going to San Francisco, coming here for a conference - well, a month ago I might have seen some review of a restaurant in San Francisco, and I wasn't thinking of San Francisco so I didn't care. Now I'm in San Francisco and realize: "Oh you know, I saw a review - a good place - and where did I see that?" I didn't keep it - now it has value - value that I didn't know at the time I made the decision, so it's impossible to manage and keep that information. So the keeping decision is not completely a local decision. Sometimes it's a sort of future decision where you have to guess: "Is this gonna be useful in the future?" There are people who think that you should keep everything. Well, just keep the cache big enough - storage is cheap - just let a search engine find it.
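As an illustrative aside, the four keeping outcomes described above (from William Jones's diagram) can be sketched as a small lookup. This is only a minimal sketch and was not shown in the talk; the names `Outcome` and `keeping_outcome` are hypothetical, and the two booleans assume that usefulness is known in hindsight, which the talk points out is often not the case when the decision is made.

```python
from enum import Enum


class Outcome(Enum):
    """The four cells of the keeping-decision diagram (labels follow the talk)."""
    SUCCESS = "useful and kept"
    MISS = "useful but not kept: must be found again later"
    FALSE_POSITIVE = "junk but kept: clutters later refinding"
    CORRECT_REJECTION = "junk and not kept"


def keeping_outcome(useful: bool, kept: bool) -> Outcome:
    """Classify one keeping decision, judged in hindsight (hypothetical helper)."""
    if useful:
        return Outcome.SUCCESS if kept else Outcome.MISS
    return Outcome.FALSE_POSITIVE if kept else Outcome.CORRECT_REJECTION


if __name__ == "__main__":
    # The restaurant review that was seen a month earlier but not saved.
    print(keeping_outcome(useful=True, kept=False).name)
```

The restaurant-review example from the talk lands in the `MISS` cell: the item turned out to be useful, but it was not kept.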
And that could be a possibility and there are people - and I'll get back to that later in another slide. So, it is an issue for decision - it is an issue for research. Organization is difficult for a couple of reasons: one, it is actually cognitively demanding to decide where you file something. We tend to think - we've been trained to think in categories that tend to be exclusive - so things live in only one place. When it comes to digital information, there is no reason why that's the case. Furthermore, it takes more effort for you to think: "Okay. Does this belong under my professional presentations or does it belong under my trip to San Francisco?" And the answer is probably, both. You don't really have to make that decision on one or the other. File systems force you to make that decision, so the task is a little harder, and then it becomes also an issue of: "How am I gonna reuse this information?" You need to get back to that. For example, filing the slides from a presentation, I put them under a folder called Presentation under a travel folder for San Francisco. And I'm actually in San Francisco for another purpose so it's under a folder of Grad School Presentation at something or other. It made sense at the time of organizing this, because of the time factor. I bet you six months down the road, I'm gonna go looking for the slides, and I'm gonna go crazy and not find them. Because I'll probably go looking for them in a folder of presentations given at, or something like that. And that's not where it is. So, and you don't want to spend way too much time linking all these things and connecting them. You almost want to just dump them someplace and just find them somehow. And actually there are people like Manos who name files by the key words they will use for searching.
The file names are like this long, separated by commas or spaces, and they're almost irrelevant, but there are presentation, Google, grad school, council, conference, San Francisco, and it's just a bunch of words that are all the possible ways of how you're gonna go back to it. And that might seem weird, but it seems perfect when you think about refinding that file later. The other problem that happens is fragmentation. You end up with things in multiple places and you look in one place and it's in the other one. There are a lot of strategies, and I'll come back to those in a second, that have been identified for files and a lot of them for email. Filers vs. pilers. Filers are people that methodically put everything in a folder someplace. Pilers are people who just leave a bunch of icons on their desktop. It's the one whose desktop looks like it's overflowing. You have spring cleaners, which are people that sort of pile for a while and then they sit there and file everything in one day. It's like a spring cleaning of your house. There are people that do filing; there are people that do tagging if the system supports it. And one of the benefits of this structural organization - 'cause if you think about it, everything I said about filing structurally is negative. Well, there is a benefit to it. The benefit to it is that as you navigate your file structure to get to a file, you're rehearsing the organization and it becomes more in your head and then it becomes easier to say: "Oh. That goes tin, tin, tin, tin. And that's stored in this that's down the road." So there is a benefit to a structured organization, because it helps you rehearse your own organization mechanism. So here's what I was saying. There are different strategies on how you organize things. There is no right strategy. There is no good strategy. It all depends on the person. You have the file and organize everything. You have the organize nothing and just search for it.
And I was more of the file and organize everything until I had this very weird experience. I, I was looking for a paper, and I literally Googled the title of the paper. The first hit that came back was my own server. I basically had accumulated a bunch of PDFs of some research papers in a folder on my server. The folder was visible publicly; the Google engine had crawled it; and here comes the link perez.cs.vt.edu. I'm thinking: "Oh, don't I feel like a moron? I am accumulating things and organizing them and then don't know where they are and Google for it and find it in my own space." And it was interesting 'cause it made it very clear in my head that maybe I shouldn't organize anything. I mean if I can put things in a place where a good search engine can pick them up, who cares where it is? I mean it, you know that's, that was a very extreme example, that from Virginia I Googled and it came back to my server that was sitting like next to my leg. And the funny thing is that I had no idea it was there. So it, it sort of changed a lot of my information practices to realize: "Okay. So the search engines are getting to a point that they are good enough that they can sort of help me find stuff to the point that I don't actually have to organize anything; or at least only have to organize loosely." There are different groups that have explored this. The MyLifeBits group at Microsoft believes that you should save everything. Literally, everything. They actually have, like, video cameras that tape your whole life and archive 24 hours of video of your life. They actually have audio recordings of your phone conversations and like archive them and then they - the plan is to eventually transcribe them and search them and all that. I think that's a little bit extreme. There are groups that believe you have to have everything structured and they are very formal in terms of taxonomies and ontologies that seem more for a librarian than for a normal person.
There are people who say: "You need to do a lot of unification so that Manos on my email is connected to Manos the co-author of a paper in my file system, and we know they're both the same person, and so forth." And there are people that just say: "You know, just dump them someplace. Who cares? Just search. The search engine will pick it up and will find it for you." Tagging and search is sort of a loose organization, loose in the sense that you're just putting some label on it and dumping it in a big bucket and then just letting search find it. Email organization is very - it's interesting, you know, it's similar but different in the sense that until fairly recently, we had lots of really bad email tools. IMAP I think solved a lot of the problems with email. So the most common problem you hear with email is: "Oh, that's on my home computer." And that only happened because you had a POP client that downloaded it and now you can't access it somewhere else. So that changed a lot of people's practice once they moved to IMAP. Now all your email is everywhere. Now you can put effort into filing it because you're filing it in a place where you'll find it later on another computer. Tags with Gmail and some of the other systems that support tags also help with that. One of the things that people do in email is this idea of prioritizers and archivers. People mark the emails or label them as: "I need to work on this" and then archive the rest. And that stays in your inbox or stays in some sort of inbox. I used to be a heavy filer on email and a number of experiences convinced me to drop that. I'm sort of a prioritizer and archiver now. All my emails are archived in one folder called "Archive." I don't care who they are from, where they come from. I don't do anything to them. Don't color them, tag them or anything. They stay in the Inbox until I handle them. Once I've handled them - archive. Out of the way.
And the search tools on the email programs are good enough that you can just go find them without any problem. So you save a lot on the organization at the cost of every now and then an email being a little hard to find. So now I'm gonna tell you about a couple of studies that we've done at Virginia Tech over the last few years. The first one is the Refinding Study done by Rob Capra. We're interested - we were interested in, in the idea of refinding. The idea of finding information you've already found. And finding information particularly on the Web. He got into this project through a very sort of wacky and innovative system. He built this thing called Web Context, and it was a caching mechanism for a Web browsing experience that would then pull key words out of all the web pages you've seen, and he built a phone interface for you to query pages and pull out addresses and phones. And the idea was the example I gave earlier of a restaurant in San Francisco, because he often had the experience of: "I'm going to a conference." You make a reservation and in the process of making the reservation, you find the area for the hotel; you see recommended restaurants; you see all this information that at that point is irrelevant. Six months later when you're in that city, it becomes relevant. But when you're in that city your computer is somewhere else or you don't have Internet. So what he would do, he would pull all the addresses and phones found on Web history pages, and then he built a phone interface that used words and you would say: "Chinese restaurant, San Francisco. I want the phone." And the system would search for the pages and actually would read out to you the phone number for any Chinese restaurants that you had browsed for. So it was a very interesting study. It was a very cool system. VoiceXML sort of got in the way in a lot of things we were trying to do.
But then he got interested in refinding information and this whole idea of: "How do I go and find something that I had already seen?" And there was this question of: "Is refinding different than finding?" And intuitively if you think about it, finding is very exploratory. You are looking for something that: "I'm sure it's out there someplace, but I've never seen it. I don't know where it is. I don't know where it would be." Refinding is this idea that there is something that you already saw, that you're certain is there. You may remember the color of the page; you may remember it was in some government website; you remember some things. The differences are enough for you to sort of have a different search behavior. So we conducted a study. We had 18 different tasks. We had two sessions. We invited people; asked them to search for information. We invited them back, I think two or three weeks apart, and we basically asked them to refind the same or similar information. Right before each task, we asked them: "How familiar are you with this task? And how frequently do you do this task?" And what we found was that familiarity and frequency turn out to be the most discriminating factors in whether you can refind the information easily or not. Meaning, if you do the task very frequently, then you can get back to the information faster. The types of tasks that we did were: looking up a word in a dictionary; looking up a phone; making a travel arrangement, meaning find a flight - the price of a flight from one city to another; finding instructions for a DVR, you know, go to TiVo and download the PDF for this particular model; sports score; movie times; stocks for a company; the home page for a professor at a university; the weather for a city; and information for a local restaurant, like what time does it open and things like that.
Interestingly enough, and contrary to what we expected, search engine did not, or I'm sorry, search engine use did not come out to be a significant factor between finding and refinding. People just use the search engine the same way they use the search engine. If they had a problem finding the information at the beginning, they had a problem finding the information when they came back. If it was easy to find it, then it was easy to find it on the second task. Task type had an effect - as you can imagine - finding out the definition of a word versus finding out the weather in a city are very different tasks and had an impact. There were a number of tasks that were very hard to do. Well, there were two in particular: one task was find the headlines - today's headlines. Two weeks later we say "Find the headlines from two weeks ago" and the sort of interesting thing we realized is that if you go to any news site, the full default URL, CNN.com or ABCnews.com, shows today's news. Well, yesterday's news is someplace in there, but it's not under today's URL back one day. It's in some non-descript URL and often it's hard even to find it by date. It's more by topic that you find it. So it was really hard for the person to go back to last Monday's news. They didn't remember what the news was; they remembered they had seen it on Monday and that became a very, very difficult task. The other one that was difficult and was sort of interesting was, we asked people: "Pick two sweaters you want to buy for a friend for a gift." We didn't give any indications as to where or what type of sweater or anything like that; [pause] and on the refinding activity we asked them to find one of those two sweaters. And what happened was, what we noticed was, that a lot of the decisions were sort of the orienteering type of decisions. Like you see two sweaters and you go: "I like that one better." And you go on and it's a very local decision that has no principle behind it.
So when they came back they went like: "I don't remember which one of these two I picked." So it became very difficult to actually follow the same path to the information that they had found before. I mentioned information addressability earlier; it's the same topic here. We came up with this model. We came up with a model that says low frequency and low familiarity refinding is really no different than finding. If it's something that, yeah, you saw it before, but you've seen it once - you don't do it frequently - it's almost like just finding something. And the behavior and the patterns and the search queries and all that are very similar. The moment you go up in familiarity, meaning that you visited that site several times, then you start learning some patterns. You start remembering middle of the road points. So if it's a page for a professor at Georgia Tech and you've gone there multiple times, maybe you remember the URL of the department and you start there, as opposed to starting at the Georgia Tech page and then the Department and then the professor. Once you cross over in frequency - familiarity and frequency mean shortcuts. People that do a task a lot of times will remember shortcuts, either by bookmarking it or by just typing the URL. So a lot of people found weather. They went straight to the weather site. They knew which one it was - they check weather all the time. So they went straight to the weather site and just entered a zip code or the name of the city. So with familiarity and frequency, you're now in the domain of, of learned shortcuts or actually accumulated bookmarks or, you know, buttons on your browsers. Then there are these strange, inherently difficult tasks. These are tasks that, that you do with high frequency, but no matter what they're really hard. Going to yesterday's news is really tough. And it's not so much anything in the task or the users, I think it's more in the Web design.
There are just some tasks that are hard to do; it doesn't matter how many times you do them, they're just gonna be down in that other category. And that category, I think, is very small. There are just very sort of odd things in there. [pause] Oh, one thing that we found that I think is worth pointing out - our study had just two sessions. You come in and then you come back a few weeks later. We found that they went from quadrant one to two after just one session. So the familiarity from just having found something once is enough to influence your access pattern and make it easier to find. You do less exploratory, you know, the start spoke model of you find one place and you go to all the research links and then you find another place. We found that just having done it once was enough for them to remember two weeks later how to get to the information. [pause] The next study I wanted to mention is calendar use. This was a study that was initially done by an undergraduate student and then Manos and I picked it up. It had been sort of gathering dust in my office for a while, and we sort of cleaned it up - did a little bit more data analysis; and there are a couple of things that are interesting and we're still trying to get this published. What we wanted to do was try to find out how people use calendars, given that today's environment has changed so much from the time when calendar management had been studied. Calendar management was studied in the, in the '90s by several people and this was the days when you had big workstations in your office and you didn't take them home; and you didn't access them from home either. So using a calendar meant running a program on your Sun workstation or something like that. Since then we've had personal computers, PDAs, now phones, now Web applications - so we wanted to sort of see how things have changed.
And this was done at a university - ninety-eight survey participants - most of them faculty or more than half faculty. And then we did 16 in-depth interviews with them. Here are some interesting findings. We found a lot of proxies for calendars. We found a lot of people that print the calendar. We found a lot of people that wrote down notes for the calendar and took it with them. We even found users that did that in spite of having a Palm - this was a few years back - in spite of having a Palm device or some kind of handheld device. We also found that calendars continue to be printed. That no matter what all the technology gives you, people have a certain comfort with the paper calendar, even if it is a printout of the computer calendar. And to this day, my supervisor at the graduate school actually prints her calendar every morning. You know, on a color printer; looks nice - all the color and everything - but still prints it every morning - doesn't go back to the computer to see what my nine o'clock is - the paper sort of is the more current one. We found that a lot of people keep family and work calendars separate. And particularly the family calendar tends to be a wall calendar posted up in the kitchen or something like that. It was amazing how many people pointed that out - it wasn't just one or two. The paper calendar continued to be of use for things that technology is actually not helping with, and I'll go to that in a few seconds. Paper trail - it's one of them. Annotations is another one. And this notion of opportunistic rehearsal is another one. Opportunistic rehearsal is when you look ahead in your calendar, and the more you look ahead the more you remember what you have tomorrow at two o'clock. And a lot of the on-line calendars don't have a good way for you to look ahead sort of casually without wanting to. And actually Google calendar has one interesting feature that is the next four days. 
No matter where you are in the week, they show you the next four days. That's a perfect example of good opportunistic rehearsal - a lot of calendars show you day or week. On the week calendar if I'm on Saturday, I can't see next week. And if I go to next week, I can't see today. It's sort of this artificial boundary of where the week ends. When in reality - today's today and I want to know what's happening tomorrow - and I don't care if it's next week - it's tomorrow. So this opportunistic rehearsal is one part that a lot of calendars don't really take advantage of and make difficult. So people prefer to print it by hand. The other one, one that I actually do a lot, is that people use a calendar as a memory aid for reporting purposes. You go back when you have to do your annual report - you go back and see where did I go this year? What did I do this year? What conferences did I attend? So you use it as a record of things you've done and as a memory of who did I meet with, and when did I meet with such and such person, and things of that sort. And again, that's another issue that I'm not sure on-line calendars do anything in particular to help, other than the fact that it's still there. There is no sort of global way of browsing old history, which clearly is different than, you know, the day to day or month to month browsing of the upcoming events. We collected calendar samples and I want to show a couple of examples here that are sort of interesting. The one on the top left - this person had an activity and printed the calendar and then proceeded to put a Post-It on top of it to write - in this case I think it was a potluck dinner - and the person wrote what she was bringing to the potluck dinner. So these are the type of things that paper calendars support, that on-line calendars make really difficult to do. 
It's the idea of adding some sort of generic note - not one that is hidden inside someplace - but just something on top. The middle one is similar. That was October 31st and the person drew, you know, a pumpkin to remind herself that she had to take the kids to go trick-or-treating. The one on the bottom is interesting. This was a grad student who parked at night in a place that he couldn't park during the day. So he put an alarm to move the car at eight o'clock so it wouldn't get towed. So he was so confident that he was gonna be in the lab at eight in the morning from an overnight working on a project, that he just had a recurring alarm for every day "move car at 7:30." And again, it's sort of a weird, odd use of a calendar. You think of calendar for meetings, maybe reminders - this is sort of very strange - off the mainstream. So what implications did we come up with for design? Paper trail. We had people that wanted to know when they deleted an event. Okay? And there are two cases for that. One is, I look at my calendar for Monday; I see I have something at ten; okay. Monday morning I go there and it's not there. Now I got to wonder what happened. And my administrative assistant has access to my calendar. So she could have deleted it. But the question is: "Did she delete it? Did she move it somewhere else? Did I look at the wrong week? Am I looking at the wrong week?" With paper calendars, if you erase that you will see the mark of the erasure and it gives you a confidence that: "Yeah. That was erased. It's not there anymore, but it was there." And that is something that gets lost in the digital - this evidence of use and wear and tear that disappears. Similar to that is this idea of tentative event scheduling. Very often you are told: "Well, separate Tuesday at two, Wednesday at three, and Thursday at eight for a meeting. It will be one of those three. Let me coordinate with everybody." 
Well you might mark all three, but then you delete two of them or you delete only one and forget to delete the other, and then you show up at the meeting on the wrong date and realize: "Oh, I forgot to delete that. That was a temporary one." And what you have is multiple appointments that are almost like an exclusive-or - you really want only one of these - and at some point you want to confirm: "This is the one and the other ones should disappear." And when you go electronic, that option is not there, but it's a common practice that is unsupported. And needless to say, intelligent alarms - how many of you have calendars that go off on your phone, your laptop, your desktop, at home all at the same time? They're all beeping and going crazy. I've stopped using alarms because of that. Because they all beep, and actually they all beep at about the same offset from each other - it's like a minute off - so my iPod will go off first, then my cell phone goes off, then my laptop. So I've just stopped having alarms go off. Which means you lose the alarm. You lose the feature of the alarm going off and alerting you because you get annoyed half the time because it's going around. So moving to sort of more of the future work that we're doing. So knowledge management in an organization is this idea that there is knowledge in the organization that is sort of not directly embedded in the mission statement or in the financial report. It's in the people that work in the organization and - a lot of organizations struggle to try to capture this knowledge. You want to have that information so that if one employee goes, the next one that comes in can pick it up sort of quickly from where they were. What we're looking at is this idea that knowledge management can be a byproduct of PIM practices. Knowledge management in an organization can be a sort of side effect of how you do your email, how you do your filing, how you do your other things. 
And we're trying to find ways that that can lead to social collaboration, to identifying experts in the organization by just looking at traffic or where the questions go. And to help shape the awareness of audiences. A lot of research on communicating on listservs says that people don't participate because: "I don't know who's receiving the email. So I'm afraid to say something that might offend somebody." And that's part of this issue of organizational management. So the big research question is: how do my personal practices benefit others in the organization? [pause] And that is - I'm sort of trying to stay away from the big social part. I'm trying to focus on the organization because there are a lot of things in common between you and the person that sits next to you. Whereas, there isn't that much in common between you and your best friend, because your best friend might work in a completely different industry and the types of emails or documents they manage are just completely different. So finding information is no different than personal information finding, but in the organization you have these other situations where people find information for you. You get emails from people saying: "Oh, I found this and I thought of you. Here." Boom. And they push it onto your face. Or they ask you questions about things you've found that now they want. "You know, at the meeting you mentioned this last year's report. Where is it?" So now you don't only have to find it, but you have to find it and make it available to others. So I think we have more ways of finding information in our organization because now part of it is sort of this collaborative information finding. We also have different ways of organizing information and this is where the personal starts moving to the centralized. At the graduate school at Virginia Tech, we have a central file share and I just absolutely hate it. I don't use it because it's the union of everybody's file systems. 
So where my things reside is like buried someplace in multiple areas and things of that sort. So there is a folder for public presentations and then there's different people inside and then they're inside. And it is sort of strange to me; it's very unnatural to use that and still keep it in sync with my file system. They don't match structure wise. Because I'm in CS, I can refuse to use it and explain technically why I don't like it. The rest of the people don't have that luxury. You have to put it over there. It's not there, you have to put it over there. And it becomes really weird. It becomes very strange. It's almost like - it's a problem looking for a solution. [coughs] It's also strictly hierarchical. You have to make a decision: "Okay, I have a form. This is an admissions form, it's also a form, it's also this, it's also the other - we have version problems - okay, this is an update - oh, I didn't put a number, etc, etc." So it gets very frustrating and it has a very high cognitive demand, very quickly. The other part that we're looking at is how to leverage this social graph of people. Imagine all the people you're connected to within the organization. I don't want to - again, I'm trying to stay away from the Facebooks of the world. Implicitly you can use a social graph within the organization to improve your PIM practices. For example, email providers use an aggregate view of email traffic across users as a way to identify spam. If a particular email has a few key words, and it went to everybody on site, you have a very high certainty that it's spam, and then you can now provide spam filters to individual users that are heavily influenced by the collective users. So we can think of the same thing in terms of organizing your email. In terms of organizing your files. Where you file that attachment that I send you might be relevant to where I want to find it. 
Explicitly, on the other hand, if we could capture the strategies, we can build a community of sorts. We can say this is how person x handles their email and you can sort of - it's almost like a skin that you would apply to your email program that would create some smart folders; create a set of tags; and it would create some instructions that say: "When you get this email do this and put it over here." And then you can share those practices through the organization. You can do it automatically because the tool does it automatically, like spam does. Or you can do it manually by allowing sort of a community of practice to build some of the strategies and accumulate them someplace and you download them when applying to your - now, on this idea of your social graph, imagine if we could capture the friend-of-a-friend network from work, from your neighborhood, from your Facebook site, from, you know, your high school friends, your personal, social life and all that, and you unify around people and then apply something like the PageRank algorithm, but a friend rank, so that people that are of higher importance show up higher on that graph. Then you can use that as a way to capture whose email is more important than others. If we could do this and save it and manipulate it, then you can temporally update it. For example, this week emails from Manos to me are of high priority because he was gonna pick me up this morning. I didn't want to miss that email in the middle of 20 other emails. Last week I was working on a proposal with three professors - emails from those three should go to the top of the queue. They are the most important ones. So if you can manipulate the importance of people, that can influence your PIM practice when you're working; implicitly you can use it to rank emails and so forth. So the idea - the research question is: "Can we capture and share the strategies for use on email?" And that's sort of something we're working on. 
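[Editor's illustration, not part of the talk.] The "friend rank" idea described above can be sketched in a few lines. This is only a minimal sketch under stated assumptions, not the speaker's system: the function name `friend_rank`, the edge format (sender, receiver) mapped to a message count, the `boost` parameter standing in for the "this week Manos is high priority" adjustment, and the sample names are all hypothetical. It uses a standard PageRank-style power iteration in which receiving mail confers importance, and the temporary boost tilts the teleport distribution.

```python
from collections import defaultdict


def friend_rank(edges, boost=None, damping=0.85, iterations=50):
    """PageRank-style importance over a who-emails-whom graph (illustrative).

    edges: dict mapping (sender, receiver) -> positive message count (assumed format).
    boost: optional dict person -> extra teleport weight, a hypothetical way to
           model temporary priority such as "this week, this person matters".
    """
    people = sorted({p for pair in edges for p in pair})
    edges = {pair: w for pair, w in edges.items() if w > 0}
    boost = boost or {}

    # Teleport distribution: uniform, tilted toward temporarily boosted people.
    raw = {p: 1.0 + boost.get(p, 0.0) for p in people}
    total = sum(raw.values())
    teleport = {p: raw[p] / total for p in people}

    # Total outgoing message weight per sender, used to normalize each edge.
    out_weight = defaultdict(float)
    for (src, _dst), w in edges.items():
        out_weight[src] += w

    rank = {p: 1.0 / len(people) for p in people}
    for _ in range(iterations):
        # People who never send mail hand their rank back via the teleport step.
        dangling = sum(rank[p] for p in people if out_weight[p] == 0)
        new = {p: (1 - damping + damping * dangling) * teleport[p] for p in people}
        for (src, dst), w in edges.items():
            new[dst] += damping * rank[src] * w / out_weight[src]
        rank = new
    return rank


# Hypothetical traffic: counts of messages between made-up participants.
edges = {
    ("me", "manos"): 5, ("manos", "me"): 4,
    ("me", "prof_a"): 2, ("prof_a", "me"): 2,
    ("newsletter", "me"): 30,
}
base = friend_rank(edges)
this_week = friend_rank(edges, boost={"manos": 5.0})
```

Sorting an inbox by the sender's score would then put a boosted correspondent ahead of a bulk sender; whether edge direction, message counts, or some other signal is the right input is an open design choice that the talk does not settle.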
I'll give you one quick example of that. We are looking at sort of applying email tags across email clients. So if I send somebody an email and I tag it, the software on the other end says: "The sender had this tag. Would you like to use it on the email too?" It turns out that the most appropriate person to tag an email is the sender. The receiver has to read the email before deciding what tags to use. Now you can send the tags along without revealing anything private to the other person. You can anonymize them. Tag I, Tag II, Tag III. And you can also then share them back and forth and you can have this notion of sending emails and then sending tags back as suggestions of how to organize it. And when you put together the collaborators across, you get a richer set of tags, so that if you sort of do unification much like CiteULike or Delicious does and say: "Yo. People that like that bookmark also like this other one." Without understanding what they are, just because there is a common link between them; then you can have people that tag this file with this tag also - or email - also tag this email with the same tag. So you can do this collaborative tagging in an organization and leverage the heavy organizer in the group to help in information organization for the non-heavy organizer in the group. And we're currently building this. We have a design - an architectural design to actually work with Gmail and that, and hopefully over the next few months we're gonna take a crack at it and see how it goes. My sort of conclusions - I have another slide at the end that I want to touch quickly. Information overload is here and it is killing us. We sort of need to find a way to do things. I think Bill Gates has been quoted as saying that he's become a librarian and he hates it. So he wants people to solve this problem. He doesn't want to spend so much time organizing his information. He wants to do work. 
I think we're way beyond the point of - we spend more time organizing than we need to. PIM is sort of studying exactly that problem. How to organize the information; how to make it useful. We're pushing it now to see if organizations can benefit from that angle, and are we gonna help people organize information; help people identify experts; and provide some added value to employees. Because this talk was co-sponsored with the Hispanic Googlers Network, they asked me to say something brief about my life history, so I'm gonna give two seconds of this. I think some of you might find it interesting. There are many career paths. My high school yearbook says I was gonna be a lawyer. [laughter] And I actually went to college to study law. Why? Because my father was a lawyer. My Mom was a university professor. I was really good at math in high school, and as a mathematician your choices are limited as to what career you get; at least that's what I was told. And I sure as hell did not want to become a university professor. That sounded boring. So math wasn't a choice. My Mom was a professor - I knew that was sort of eh, eh. So it is. And sure enough I ended up being a university professor, mostly by completely pure chance. I ran out of the math courses in business. I was being a minor in business; took all my math courses; ran out of them really quick. My English wasn't that good, so I wanted more math courses. Those are numbers - I understand those without knowing English. And my advisor said: "You want to take this four-term FORTRAN programming thing?" I said: "Sure. Does it have a lot of English?" "No, it's mostly formula." "Ah." Needless to say, that was it. And halfway through the semester I went back to him and said: "Can I do a major on that?" He said: "Sure you can. It's called Computer Science." "Really. Oh. Cool, let's do that." And I switched to a minor in business and a major in CS and then from there on, you know, one thing led to the other. 
I ended up getting a Ph.D. And I'm a boring university professor now. So there are many career paths that you can follow that are non-traditional, non-standard; none of them are better than the others. Sometimes we, as minorities, we sort of feel bad because everybody went to Stanford and has Ph.D.s from MIT. It's like, no. The other one that I find a lot of people have a hard time with, is that everything is relative. I get asked this question: "Do you speak a foreign language?" And I say, "Yeah. English." And people sort of have a hard time going, "No, no, I mean a foreign language." I say: "Yeah. English." I did not grow up speaking English. English is a foreign language to me. English is a second language to me. And it's "No, no, but I mean, you know, other than English." I said, "Spanish is not a foreign language to me. It is to you, but not to me." And people have a hard time understanding that. My wife once was asked: "Have you ever lived abroad?" And she says, "I do now." I mean abroad means you're living away from your family; you're living in another country. We grew up in Puerto Rico. This is abroad for us. So it's all relative. People have a hard time with that here in the United States. I think all of you understand that. In technology we have a lot of people from international areas. Find mentors that don't care why they have to give you advice. A lot of times we get advice from the wrong people. And this one I'm not gonna get into. We're gonna have a session in the afternoon at 1:30 to talk about diversity. That's what I do at the grad school. It's in this room. If anybody's interested in coming and talking, it will be an informal discussion. We can talk about that. So I'll stop there and take questions. [audience talking to one another] Okay. No questions? Okay. So we're done. Thank you. [techno music playing]