Meaning of Project 311 disaster data

Great East Japan Earthquake Big Data Workshop Project 311
Implications of Disaster Data
Izumi Aizu (Institute for InfoSocionomics, Tama University; Information Support Pro Bono Platform)
Kazuhiko Tada (Tono Magokoro Net)
Shinya Saka (Plus Alpha Consulting)
I'll be presenting the findings of our team, which includes Mr. Tada from Tono Magokoro Net and Mr. Saka from Plus Alpha Consulting. We looked at data from different media organizations and Twitter, rendered here in a very basic way by yours truly. We narrowed our focus to three locations: Kesennuma, Rikuzentakata, and Otsuchi.
This is NHK's coverage. There is comparatively more reporting on Kesennuma, starting right after the earthquake and before the tsunami, since there were stationary cameras there. Otsuchi appears in the news once or twice in the afternoon, but you can clearly see that Rikuzentakata does not appear until much later and gets little coverage overall.
After mapping the data using Plus Alpha's "visual" text mining, we found a relatively high number of specific words like "current situation," "fire," "tsunami," and "footage" in the coverage of Kesennuma. Objectively speaking, the coverage contains fairly concrete words. A fair number of words come up for Otsuchi. We find the names of places nearby, like Miyako, Ofunato, and Yamada, but very few words describing the specific situation of the victims. Rikuzentakata has even fewer words, as you can see.
Here is JCN's news data for the first week. A quick glance tells us that there is little coverage of Otsuchi overall and comparatively more of Rikuzentakata and Kesennuma.
This is news article data for Asahi Shimbun, just the text-mined data because we had little time. In this case, we didn't see major differences in the coverage of Kesennuma, Rikuzentakata, and Otsuchi over the first week. But on closer inspection, it seems there are some differences.
Then, looking at the Twitter data... These are just the original tweets from the first week.
We haven't analyzed the geotags, though that's something we wanted to do. There were a lot of tweets about Kesennuma right after the earthquake, but not that many about Rikuzentakata or Otsuchi. This is not necessarily due to tweets posted by people in Kesennuma. The number also includes tweets by those who saw the footage on TV and realized Kesennuma was in serious trouble. There are, of course, a variety of factors, such as population and the number of people involved, so we can't jump to conclusions based on this data alone.
When we look at the content of the tweets, at their semantic attributes, we find requests like "tell me," "hang in there," "stay safe," "contact me," "I want to know," and "please," expressing the worries of a vast number of outsiders with no information. The second main attribute is "can't," such as "can't reach," "can't contact," "can't confirm," "can't confirm my aunt is safe," "can't confirm on the internet." These are the two semantic attributes with the most tweets. As with the news, there is very little information that comes directly from the disaster areas or the victims there.
We divided the tweets into those with firsthand, secondhand, and thirdhand info. Firsthand covers those who reported a personal experience. Secondhand covers info received from an acquaintance or family member by phone or e-mail. Thirdhand covers info from a third party like the media.
So the bar on the left is for March 11th, while the one on the right is for March 17th. Firsthand information for Rikuzentakata... Sorry. This combines firsthand and secondhand information... Looking at just the firsthand info on the 11th gives us 2 tweets. Secondhand info: 4 tweets. Thirdhand info: 81 tweets. We actually read all these tweets... So we haven't analyzed the 12th, 13th, 14th, or 15th. The 17th had 4 tweets with firsthand, 80 tweets with secondhand, and 240 tweets with thirdhand information.
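As a rough illustration of the Rikuzentakata counts just quoted, the share of each source type can be worked out as below. This is a minimal sketch, not the team's actual tooling: the function name `share_by_source` and the dictionary layout are assumptions made for illustration, and only the six counts come from the talk.

```python
# Tweet counts for Rikuzentakata as quoted in the talk.
# The data layout and all names here are illustrative assumptions.
COUNTS = {
    "March 11": {"firsthand": 2, "secondhand": 4, "thirdhand": 81},
    "March 17": {"firsthand": 4, "secondhand": 80, "thirdhand": 240},
}


def share_by_source(day_counts):
    """Return each source type's share of that day's tweets, in percent."""
    total = sum(day_counts.values())
    return {kind: 100.0 * n / total for kind, n in day_counts.items()}


if __name__ == "__main__":
    for day, day_counts in COUNTS.items():
        shares = share_by_source(day_counts)
        print(day, {kind: round(pct, 1) for kind, pct in shares.items()})
```

On these figures, firsthand reports make up roughly 2 percent of the classified tweets on March 11 and roughly 1 percent on March 17.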
So even after a week, there are hardly any tweets with firsthand information. The situation is practically the same for Otsuchi. Firsthand information: 0 tweets. And at the end of the week, 6 tweets. This is a close-up of the same data.
Let's look at examples of firsthand information. Otsuchi doesn't have any for the first day, but there are tweets with secondhand information like, "Is everyone okay?? I'm at school right now, so I'm safe. I was able to contact people in Kamaishi and Otsuchi after the first quake." So there are tweets like this. One of the first reports from Rikuzentakata was, "I'm in Rikuzentakata. The Kesen River is flooding and houses are being washed away, but the internet is still working." The person with this Twitter ID doesn't post anything after that for at least the next week. That night, there's a tweet saying, "I can't come to grips with what's happening. I've witnessed a terrible tragedy."
So, on to what we can't see from the data. There are things that never make it into the data, leading to blank areas in the information we have. So the question is how to see what we can't see. We can try to detect more using digital technology, of course, but the digital world can only go so far before it becomes ineffective. That's when we need to combine digital tools with analog methods. This means going to disaster areas, getting there fairly early on, and gathering information. We think this is where digital tools and special means of communication are needed.
Finally, we plan to use our findings for investigative studies requested by people in the affected areas, and I'll put in a plug here for... ...use our findings as reference data. Thank you very much.
Thank you. This is as much a question for Mr. Suzuki sitting next to me, but I think you're absolutely right. Like that comment earlier about how houses are marked with circles and x's. If this information can be shared, if we can get to that point, it will be very useful in other areas as well.
I realize, of course, that doing so immediately would be very difficult. But how do you think we can move in that direction?
Mr. Sato, chief of Kesennuma's Crisis Management Section, told me they posted tweets right after the earthquake. Their mobile phones worked for about 5 hours. But it's very hard to get feedback that way, so I think the development of a functional feedback mechanism is key. Obviously, the harder an area is hit, the less information you get from it. But we can at least airdrop information-gathering resources into these areas, along with police and firefighting forces. When there's a shortage... An advance information-gathering team has just been formed from 10,000 police officers. I think these cooperative efforts are vital.
The less information you get from an area, the worse the damage. It's just as the professor here said: those are the areas one needs to focus on first. At first, when the tsunami hit and receded soon after, I thought we could go out and save people. But even after the waves subsided, it was hard to clear paths through the wreckage. The place was covered in it and still flooded with water. So sometimes, the damage is so great that you can't even get close to the victims. I understand that information gaps signal danger, but there's this paradoxical situation where you can't get to those places. But I think this could be applied the following morning, or to reach accessible places immediately.