Public Health 250a - Lecture 19

Craig Steinmaus: Okay. So your brilliant instructor has asked me to come in and talk about retrospective cohort studies. Supposedly you guys have already talked about a couple of study designs, specifically prospective cohort studies. So what I'm going to do is I think the best way to talk about retrospective cohort studies is to put them in context relative to prospective cohort studies. I think that's the best way to sort of understand what's going on here and then we'll give you a few examples. Now, Art sent me a couple of his lectures and it seems like he gives you guys 20 million examples. So I'm not going to give you 20 million examples. I'll give you maybe like a million. No. I'll give you like 2 or 3, focus more on specific examples than a bunch of them. Then a few issues about I'm kind of hoping you guys just aren't going to be in the future reviewing other people's studies but maybe doing studies yourself, designing studies and writing up the study design and submitting grants and then actually doing them. So I'll give you a little bit of study design stuff. And then advantage and disadvantage of these particular study designs. Okay. So this is me right here. And I've actually been at the university for many, many years. Um, but you may not see me on campus all that much. I got another job at California EPA which I spend most of my time now. But I've been teaching here about 12, 13 years. And in the department of epidemiology. And I do research on arsenic and drinking water. And if you want to lodge complaints about this particular lecture, there you go. Send them right here. Okay. So retrospective cohort studies. Okay. So, again, to talk about retrospective cohort studies I'm going to talk about prospective cohort studies and I know you guys have already had a lecture on prospective cohort studies. Right? So I'm going to give you my view. 
Epidemiology can be a little bit complicated so it's always kind of nice to get two people's versions of these particular study designs. So maybe if you can't understand one you can understand the other. But also you know, again, I think like I said before, it's easier to understand retrospective cohort studies if you put them in the context of prospective cohort studies. Compare the two. Okay. So, let's talk about, well, first of all the words that I use when I lecture. Epidemiology in my mind is about trying to find out whether an exposure causes a disease. Now, other people use different words. They don't like that word cause. They use the word association. And they don't like the word exposure. They use the word predictor variable or independent variable. Whatever. We're all talking about the same sort of thing. And then they don't like, you know, we're not always looking at a disease, you know. We're looking at an outcome or maybe a dependent variable. So, again, when I say exposure, I mean any of these. And when I say disease, I mean any of these. So just keep that in mind. Art may use other words or your instructors may use different words. But I'm going to talk about exposure, does that increase the likelihood of a disease. That's what I'm talking about. That's the words I'm using. Okay. Prospective cohort study. In a prospective cohort study as you know, your first step is defining your cohort. So you're standing here, 2012, whatever today's date is, October 8th maybe. You're standing here October 8th, 2012 and you find your cohort. And your cohort has to have some people with the exposure variable and some people without the exposure variable. That's your first step. And here's some examples of what a cohort is. All children in Oakland, California. Anybody in my small village in Tanzania. You know, all the workers of Chevron. A group of people and some of them have the exposure that you're interested in and some don't. 
In a cohort study we usually start with, you know, there's always these different sort of variations of different study designs, but we usually start with people that are disease free. So no disease, but some exposed, some unexposed. And then we follow them forward over time to see who gets the disease and who doesn't. So we're starting here in 2012 and we're going to follow this group of exposed and unexposed people forward over time to see who gets the disease and who doesn't. This example I put five years. Ha ha. Seven years, 2020. But the bottom line is following forward. And then, again, each one of these guys is a subject and then people with the disease is the red. People without. So we follow the exposed and unexposed people and see how many people in the exposed group get the disease and how many don't. Key is going forward over time. Now, a good example of a prospective cohort study is the nurses health study. How many people have heard of the nurses health study? I think they actually now have a nurses health study II. But it was a big huge -- these studies are incredibly expensive in general. Big huge study, millions and millions of dollars that have gone into the study and basically it just followed a group of nurses over time. And they've actually looked at a whole bunch of different things. This particular part of the nurses, they have probably looked at thousands of different exposures and thousands of different outcomes over the years this has been going on. It's been going on since the 1970s. So this particular part of the study they looked at post-menopausal hormones. That means women that are going through menopause they take estrogen and progestin to help relieve some of those symptoms or at least they used to. So this was a study to see if those hormone pills during menopause increased the risk of, this is ovarian cancer. All right? So they had a group of women, some took the hormones and some didn't. 
And they started in 1976 and again, followed them forward over time. None of them had ovarian cancer in 1976. They followed them forward over time to see who got the disease and who didn't in the particular groups. Okay. Prospective cohort study. Okay. And I don't know if Art kind of went through this or not, but, you know, it's usually not so simple as I just showed in this previous slide. You know, usually it gets a little more complicated than that. In other words, they started with over a hundred thousand nurses. At the time only 24,000 of them were post-menopausal. But as time went on some of them became menopausal and they started taking the hormones. And then over time some of the women die or some may be lost to follow-up. So you have people coming into the cohort and people coming out of the cohort so it does get a little more complicated than what I had initially shown. So just keep that in mind. And here's the results. You can sort of see the never users of the hormones here. Not very bright. But you can see the never users of the hormones and the current users of the hormones and past users and, you know, they have the number of ovarian cancer cases in each particular group and the person years. You guys have talked about person years, person time. Okay. Good. In each particular group. And then they can calculate rate ratios. You can see here they found a little bit of an increase, like a 24 percent increase, but it wasn't statistically significant. Okay. So prospective cohort study. Okay. Everybody get that? Cohort. Follow them forward over time. All right. Retrospective cohort study. That's what I'm supposed to talk about. Retrospective cohort study, you're starting at the same point in time. You're starting 2012. But instead of getting a cohort in 2012 and following them forward to 2020 what you're doing is you're actually physically starting the research in 2012. But you go back in time to get your cohort.
In other words, you say, okay, give me people from 2007, a group that was exposed and a group that was unexposed. And the classic example. You see a lot of retrospective cohort studies in occupational medicine. The classic example. You are standing here in 2012, you are worried about let's say does benzene cause leukemia. So in 2012 you go to Acme Benzene Factory and you say Acme Benzene Factory, give me a list of all your workers that were working in your factory five years ago, 2007, all those workers. Now in Acme Benzene Factory some people there are exposed to benzene and some people, the administrators, the executives, they don't work with benzene. So some people in Acme Benzene Factory are exposed and some are unexposed, but the bottom line is you are starting your research in 2012, but you are defining your cohort as a cohort that existed in the past. Okay. Yes. >>>: (Inaudible). Craig Steinmaus: Now, yes. Yes. Well, have you guys had your lecture on case control studies? Okay. So I'm only going to give you 15 seconds of case control. Case control studies you start -- okay. Cohort studies think about exposure. You're getting a group of people that are exposed and unexposed. And then you're following them to see who gets the disease and who doesn't. Case control studies you start with disease. You get your cases. Then you get your controls. And then see who is exposed and who is unexposed. So the difference between the two is where's your starting point? Is it exposed/unexposed or is it cases and non-cases? That's the difference. Okay. Let's see. Okay. Yeah. Okay. So we're going back in time. So that's the difference. Prospective, start here, go into 2020. Retrospective, I'm starting here but I'm getting a cohort, the start of the cohort started in the past. And then my next step, once I define the cohort that existed in the past then I follow them, it's over time, but it's not forward over time.
I follow them until today to see who got the disease and who didn't. So you're essentially pretending that you're putting yourself back into 2007. You're putting yourself back in the past and you're pretending you're following people forward over time, but you're not because you're starting here. Okay. Does everybody get that? That's the major difference. Are you moving forward or are you doing everything that happened in the past? Okay. So this basically just says what I just said. Prospective, again. Disease free, benzene exposed, benzene unexposed and then follow them forward over time. Let's do this. Follow them forward over the next five years, versus retrospective, where that cohort, the start of the cohort, existed in the past and you're looking at things that happened in the past. It's the same design. You're still following, you can still follow the rates of disease. It's just are you starting here going here or starting here and going there? Okay? All right. Now let's give you a couple of examples. Um, I'm going to let you know which examples I stole from Dr. Reingold and which ones I didn't. So that -- and which slides I stole from him and which ones I didn't so you can focus on those slides that I took from him. Right? Because those are most likely the ones that are going to be on the test I assume. All right. So this I stole from Art. Um, but I should have picked it up myself because this is my area of expertise, arsenic. But this is a retrospective cohort study of respiratory cancer, lung cancer basically, in copper smelter workers. And the big exposure in copper smelter workers is when you smelt any sort of metal, especially copper. Well really copper, there's a lot of arsenic that's in the same rock as the copper. And so you get heavily exposed to arsenic. So really when they say copper smelter workers they're talking about arsenic exposure. So, okay. So let's show this study design. All right.
So what they did, Jay Lubin at NCI -- when you see an article by this guy, pay attention. He's a really smart guy. So Jay Lubin at NCI in 1989 he physically started this study. He said, well, let's look at arsenic in copper smelters and the risk of lung cancer. So he physically started this study in 1989 when he started his research. And what he did in 1989 was he said, you know, he went to these guys, Anaconda Copper Mining Company. They're a copper smelter in Montana. And he said, you know, Anaconda Copper Mining Company can you give me a list of everybody that worked in your facility from 1938 to 1957. All right. Before I had shown you where, you're standing here in 2012 and you go back to 2007 and I said all the workers that worked in 2007. It doesn't have to be one particular year that the cohort starts. It can be over a period of time. But the bottom line is when the cohort starts it's in the past. The cohort starts in the past. So it can either start in 1938 or it can be any worker that worked from 1938 to 1957. All these workers started working in the past. So sometimes it's a period in the past like that one. So they said, every worker that worked in your company for at least 12 months. They didn't want people that worked for a day, hated it and left. So all workers that worked there for at least some period of time between these two years. And based on this information they also had records of what areas in the company that these workers worked in. Did they work in an area that had a lot of arsenic exposure? You know, the guys that were actually smelting the copper. Or were they workers that worked maybe in the office or something like that. Or some other part of the smelter that wasn't so heavily exposed. So all the workers that started working during this period, some were exposed. Some were unexposed.
And then they followed them forward over time until again, he's standing here today and he says, okay, how many of these people that started working here got lung cancer in this intervening period? Again, all in the past. Okay? Make sense? All right. Who got the disease and who didn't? Up until today. All right. And here's their results. And I'm not sure why they did this. You know, sometimes people do things that you don't quite understand. Maybe they had a good reason for it. But what they did was they looked at their exposure like this. They actually divided people into people that worked with only a little bit of arsenic, light or unknown arsenic exposure areas. People that worked in medium arsenic exposure areas and people that worked in heavy arsenic exposure areas and then they divided them into the number of years that they worked in those areas. So let's look at these results because I think these are the most relevant. So these are people that actually didn't work. They took anybody that didn't work in a heavy exposed area. 0 years in a heavy exposed area. They still worked at the factory but they didn't spend any time in the heavy exposure areas. And then groups of people that worked increasing number of years. And you can sort of see, you know, just like in any cohort study. Right? You have your person years. Cohort studies you're looking at incidence rates and what's an incidence rate? It's your number of cases divided by your person years. And the rate ratio is that rate for your exposed group divided by the rate for your unexposed group. And so they had the person years. They have the number of cases in each particular group. And you can see as the number of years goes up the relative risk or they should call it a rate ratio. That's where it is. The rate ratio goes up. Okay? So that's pretty good evidence that increasing arsenic exposure, at least the longer you work in these heavy arsenic areas, increases the risk. But the bottom line: retrospective.
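The rate ratio calculation described here can be sketched in a few lines of Python. This is only an illustration: the group labels and the counts of cases and person-years below are made up for the example and are not the smelter study's data, and the function names are hypothetical.

```python
def incidence_rate(cases, person_years):
    """Incidence rate: number of cases divided by person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return cases / person_years


def rate_ratios(groups, reference):
    """Rate ratio of each group relative to the named reference group."""
    ref_cases, ref_person_years = groups[reference]
    ref_rate = incidence_rate(ref_cases, ref_person_years)
    return {
        name: incidence_rate(cases, person_years) / ref_rate
        for name, (cases, person_years) in groups.items()
    }


# Hypothetical (cases, person-years) by years worked in a heavy exposure area.
groups = {
    "0 years": (20, 40000),      # reference (unexposed) group
    "1-9 years": (15, 15000),
    "10+ years": (12, 6000),
}

ratios = rate_ratios(groups, reference="0 years")
for name, ratio in ratios.items():
    print(f"{name}: rate ratio = {ratio:.2f}")
```

With these invented numbers the rate ratio rises with years in the heavy exposure area, which is the pattern the lecture describes on the slide. The reference group always gets a ratio of one because it is divided by itself.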
He was standing here in 1989, everything that happened with these people happened in the past. Question. >>>: (Inaudible). Craig Steinmaus: Absolutely. And that's, these occupational studies, whatever study that you do -- my background is occupational medicine. I worked in a clinic, I was a clinician, a physician, for 7 or 8 years in an occupational medicine clinic down in Silicon Valley. We had a lot of the big corporations down there. You know, Intel, Hewlett-Packard all those guys. We'd see all their injured workers. But -- so I've got a lot of experience with occupational medicine. Companies don't want researchers coming into their companies to look at health effects. Right? Because if you find something what they're really afraid of is if you find something that isn't really there, you make a mistake. The workers get really worried about it. They end up suing you. Right? Companies would rather not risk that. So it's actually pretty hard. Now it happens a lot where you are able to get access to a company. But it's pretty hard to do research in a company like this. So, now, every once in a while people are able to do it. So you can do it. But again, access to companies is very hard. Okay. Now another thing that I wanted to point out about retrospective cohort studies is, um, a lot of times, and in fact I would say maybe even most of the time retrospective cohort studies they don't do rate ratios. They don't do the number of cases divided by the person years. What they do is standardized mortality ratios. Have you guys talked about standardized mortality ratios? Okay. So then you know or you should for the test -- no, I don't know what's on the test. Right? I'm just making it up. You should -- standardized mortality ratio is where you take a cohort of exposed people like let's say Acme Benzene Factory and you just take those benzene exposed workers and you see their rate of disease but you don't compare it to unexposed workers.
In an SMR you actually compare it to the rates in the entire United States. So the entire United States is like your unexposed group. Now sure there's some exposed people in the entire United States, it's not really unexposed, but it's such a small fraction that the United States is essentially unexposed, at least to high exposures to benzene. So that's an SMR. You're not taking the unexposed group from the factory itself, from the cohort itself. It's the entire United States that's your unexposed comparison group. So you will see that, I, you know, for this lecture I was trying to find some good examples of retrospective cohort studies. And I would say, you know, 95 percent of the time it seemed like researchers were presenting their results as standardized mortality ratios. So it's not always a rate ratio. But again, bottom line is retrospective, it's always in the past, what happened always happened in the past. Okay. Any questions about that? All right. Um, oh, can you guys see that? >>>: Yes. >>>: (Inaudible). Craig Steinmaus: Oh, you didn't? Okay. Let me give you my quick two-minute version. Yes. >>>: (Inaudible). Craig Steinmaus: Yeah. So let's say here. Like in this example, let me compare it to this example. In this example we have an unexposed group. 0 years in the high arsenic exposure group. Okay. So that's your reference group. Right? You know it's a reference group because they give you relative risk of one and no confidence interval. So instead of using these people as your unexposed reference group you'd use the rates of lung cancer in the entire United States. They would be your unexposed reference group instead of this group. Does that make sense? All right. So we'll go quickly through John Snow. So you guys probably know then that there was this big cholera epidemic in 19 -- sorry, 1853, when I first wrote up these slides everything was 1953, it's like I couldn't get myself back a hundred years or so.
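As a rough illustration of the standardized mortality ratio described above, here is a minimal Python sketch. It assumes the usual age-stratified form (observed deaths in the cohort divided by the deaths expected if the cohort had the general population's rates), which the lecture does not spell out. The age bands, person-years, reference rates, and observed count are all invented for the example.

```python
def expected_deaths(person_years_by_age, reference_rates_by_age):
    """Deaths expected if the cohort had the reference population's rates."""
    return sum(
        person_years * reference_rates_by_age[age_band]
        for age_band, person_years in person_years_by_age.items()
    )


def standardized_mortality_ratio(observed, expected):
    """SMR: observed deaths in the cohort divided by expected deaths."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed / expected


# Hypothetical person-years contributed by the exposed cohort, by age band.
cohort_person_years = {"40-49": 10000, "50-59": 8000, "60-69": 5000}

# Hypothetical national death rates per person-year for the same age bands.
national_rates = {"40-49": 0.0002, "50-59": 0.0005, "60-69": 0.0012}

observed = 24  # hypothetical deaths actually seen in the cohort
expected = expected_deaths(cohort_person_years, national_rates)
smr = standardized_mortality_ratio(observed, expected)
print(f"expected = {expected:.1f}, SMR = {smr:.2f}")
```

An SMR above one here would mean the cohort had more deaths than the national rates predict. The comparison group is the whole country rather than unexposed workers from the same factory, which is the distinction the lecture draws between an SMR and a rate ratio.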
But 1953, all right, there's a big cholera epidemic -- 1854, sorry. You know, he looked, so the cholera epidemic had already started. So things had already happened in the past. So, in 1854, you know, he looked at the water supply in London and saw that there were two major water suppliers, the Southwark, I don't know how to pronounce that, Vauxhall maybe. Southwark and Vauxhall Water Company and Lambeth Water Company. One got water from a, um, oops I'm sorry. This is supposed to be, I made a mistake here. This is supposed to be clean. They drew water from the Thames River at a point that wasn't heavily polluted. Sorry. Everybody write that down. Wasn't heavily polluted. And then a second water company, where it was, they drew water from the Thames River at a spot that was heavily polluted. So was heavily polluted; was not heavily polluted with sewage. And what he did was he went to every house and saw which houses got water from this company, which houses got water from that company. And then which houses had a case of cholera and which houses didn't. And there's your results. Right? I won't spend much time on this. Okay. So the company with the polluted water, you know, more cases of cholera than the company without. Okay. Simple. All right. Okay. I'm not going to go over this particular example. Maybe I will if we have time at the end. But we're going to put our slides on, you guys have bspace. We'll put these slides on bspace. I would recommend everybody just take a look at this example. It's a pretty good retrospective cohort study. Um, does, um -- this was about does having an abortion increase the risk of breast cancer. So this is one of Art's examples. So I would go on bspace, get my slides, check this out. And maybe we'll go over it if we have time in the end. This is another example. I don't want to kill you with examples. Okay. So, again, I think you sort of see the basic difference between a prospective and a retrospective study.
And what I'm going to talk about now is some general principles of study design and these principles actually apply to both. Okay. Um, and then I'm going to talk about well, what's the advantages. When would you want to do this and when would you want to do that? So, just some general principles. And this is kind of a general principle in, um, study, in pretty much all study designs. And this is a point that we're dealing with now. You know, again, we do studies on arsenic in drinking water. And whether they cause cancer or other health effects. And what we're dealing with now is we do our studies in areas that have a lot of arsenic in their drinking water. It's naturally occurring. In fact there's hundreds of millions of people, maybe even a billion people, well, let's say hundreds of millions of people throughout the world that have naturally occurring arsenic in their drinking water. A lot of these people have very high levels and that's what we do. We go and we study these people. There are people now that are studying these very low levels of arsenic. Very low. And they're getting a lot of money for their grants. And they're getting a lot of money that we think should go to our grants and we think they're wasting their time. And let me show you why I think they're wasting their time. Is that in general, if you're trying to find out whether an exposure causes a disease, you kind of want to go and get some people that have high exposure. Not people that have really low exposure as your exposure group. In other words, let me give you an example. Let's say you want to -- let's say we don't know that cigarette smoking causes lung cancer. And we want to do a study. You could do a study where you have an unexposed group and they smoke 0 cigarettes per, oops, 0 cigarettes per year. And you could get an -- you could compare them to smokers. And let's say you did this. You had an exposed group where they smoked one cigarette per year. Okay. 
That's not a very high exposure. There may be some increased cancer risk associated with that, but it's going to be so small you'd never see it. You'd probably get a relative risk like this. 1.0001. Something like that. And when you have relative risks that are this small, this small of an increase. You never know if that's true. That could very easily be due to chance or it could be very easily due to confounding. Maybe you just had one person in this group that just by chance he happened to be exposed to a lot of silica and he got cancer because of silica. Or whatever. Just some sort of chance or confounding or bias can cause something like this. So, if you want to figure out whether smoking causes lung cancer, you don't look at a group like this. You look at a group like this. That they smoke maybe oh, greater than 20 cigarettes per day. Right? Because if you do that you'll get a relative risk of ten. And it's much -- if you guys can see that. Yeah. Except you guys in the front row. It's much easier to tell that this is real. This is really an increase. There's really an association. Much easier to tell here than there. So when you're doing a study and you're trying to find a cause and effect relationship, a real association, it's usually best to include a higher exposure group. That applies to any study design and it applies to retrospective cohort studies, prospective cohort studies. You want some people that have high exposure. Yes. >>>: (Inaudible) decides what parts per million (Inaudible). Craig Steinmaus: Right. Did everybody hear that question? What if you already know that smoking causes lung cancer? Well let's say a different chemical. Let's say PM ten. Particulate matter ten, air pollution. You already know air pollution increases the risk of asthma. But you want to know, you know, if you get a lot of air pollution you got, you're going to get asthma. But what about these lower or moderate levels of air pollution? Are those unhealthy?
So that's an exception to kind of what I just said. Sometimes you're not looking at whether an exposure is associated with the disease. You've already figured that out. Now you want to figure out, well, what level of exposure causes what level of a disease. And in those instances, yeah, sometimes you want to get into those middle exposure groups. Okay. But what I'm saying is we're running into this in our arsenic exposure study. Somebody has just submitted a prospective -- actually no, it's a retrospective cohort study on arsenic in drinking water and the risk of diabetes. And the exposure is so low. They did that, they did that one cigarette per year essentially. That's about their level of arsenic exposure. It's so low that they're never going to be able to find anything. So, um, so, okay. That's that. So, again, if you're looking at does this cause this, you usually try to get a high exposure group. If you're looking at, again, and I'm not saying just high exposure and just no exposure. What's really the best is to have high, medium, low, none. Because then you can look at a dose response relationship. Does the relative risk go up? But what I'm saying is don't forget that high exposure group if you're looking at does this cause this. Again, exceptions. Sometimes you really are just interested in the low exposure. In that case, yeah, just study low exposure. But if you're interested in does this, does chemical X cause disease B. Have a high exposure group. Again, you know, a range of exposures is nice, but some people should be up here. Okay. This just says what I just said. You know, again, if you're looking at does meat increase the risk of colon cancer. You wouldn't want to look at people that eat six servings a day compared to people that eat five. Right? That difference is just too small. You want to have more of a contrast of exposure. Ten servings of meat a week versus 1 or 2 and then you can really see a real effect.
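The point about exposure contrast can be illustrated with a short Python sketch that computes a risk ratio and an approximate 95 percent confidence interval using the standard log-based formula. The cohort sizes and case counts are invented for the example, and the function name is hypothetical. The interval only speaks to chance; it says nothing about the confounding or bias the lecture also mentions.

```python
import math


def risk_ratio_with_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Risk ratio and an approximate confidence interval on the log scale."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    ratio = risk_exposed / risk_unexposed
    # Standard error of ln(risk ratio) for cumulative-incidence data.
    standard_error = math.sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(ratio) - z * standard_error)
    upper = math.exp(math.log(ratio) + z * standard_error)
    return ratio, lower, upper


# Hypothetical low-contrast study: exposed group barely differs from unexposed.
low_contrast = risk_ratio_with_ci(11, 10000, 10, 10000)

# Hypothetical high-contrast study: heavily exposed group versus unexposed.
high_contrast = risk_ratio_with_ci(100, 10000, 10, 10000)

for label, (ratio, lower, upper) in [("low", low_contrast), ("high", high_contrast)]:
    print(f"{label} contrast: RR = {ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

In the low-contrast case the interval straddles one, so the small increase cannot be told apart from chance. In the high-contrast case the whole interval sits well above one, which is why a high exposure group makes a real association easier to see.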
Again, it would be nice to have a range of exposures, but you'd want to include a high and a low. Question. >>>: What about (Inaudible)? Craig Steinmaus: So, if you were going to do a study of secondhand smoke and I've done some work on this, secondhand smoke and the risk of breast cancer. If you were going to do that study, where would you do it? Would you do it in you guys? Or would you do it in maybe people that work in bars? Or people that are married to a smoker? Those studies that looked at the health effects of secondhand smoke, they went to this, the people that had the highest secondhand smoke exposure. Again, the people that worked in the bars, the people that were married to a smoker. They looked at them. They didn't look at people like you guys, who have some exposure to secondhand smoke. But in general I would guess for most of you it's pretty low. Okay. All right. The other basic principle of study design is, you know, you have an exposure group and an unexposed group. They should be as similar as possible in all regards in everything except for this exposure of interest. Otherwise they should be pretty much the same. Now if the exposure causes the disease they may be different in the rates of disease, but other than the exposure and the disease they should be as similar as possible. And that applies again, prospective cohort studies, retrospective cohort studies. All right. So this is an example of a study that we're doing where we're looking at past arsenic exposure. And you can see we actually included two separate towns in northern Chile. We had one town that in the past they had very high arsenic exposure. And we included another town that didn't. But look at all these other variables. In terms of all these other variables, you know, the type of the town, the percentage of males and females, the age distribution, education, rates of smoking. Everything else is very similar.
So we have an exposed town and unexposed town, but other than that everything is similar. That's what you want in a cohort study, retrospective or prospective. All right. What you don't want, let's say we were doing this same study of arsenic in water. All right. And we had an area with arsenic in water and an area without arsenic in water. And let's say we were doing does arsenic cause bladder cancer? You wouldn't want a situation like this where you had a big -- benzidine also causes bladder cancer. You wouldn't want a big benzidine factory in your arsenic area. Because then if you found increased rates of disease here you wouldn't know whether it was due to the arsenic or due to the benzidine factory. So you don't want major differences between the two. You want to be as similar as possible. Now there can be some differences. Like you could have 60 percent smokers here and 50 percent smokers here. Those sorts of minor differences you can adjust in your statistical analysis. But you wouldn't want a hundred percent smokers here, 0 percent smokers here. This general principle cohort studies, retrospective and prospective as similar as possible. Okay. So, and, again, another aspect of both cohort study designs is you are following people over your period of time. You're taking a cohort and you're following over a period of time to see who gets the disease and who doesn't get the disease. So, just some ways that people have followed people. It's hard to follow -- these studies can be very difficult. Especially like here in the United States because people move all the time. I think the last statistic I saw was about 50 percent of the people in the United States move every five years. So if you are following people for five years, you know, at least half your cohort has moved. If you are following people for 20 years that's a lot of people that have moved to a different residence and you got to find them. 
And there's really no good database here in the United States to find people. In Chile we have an election registry we can use to find people. Um, and in other countries they have similar sort of things. They have town registries, that sort of stuff. Um, you know, but other ways. You can look at this list. Other ways of following people, death records, marriage certificates, driver's license, sometimes medical records. In Chile sometimes people aren't on the electoral registry so we have to go to their -- if they lived in a house that we know of we go there. If they're not living there we ask the neighbors or we try to find relatives and say hey, where did George move? So you're following people over a period of time to see if they got the disease or not. So you've got to find them every time you're looking whether they got the disease you've got to find where they are. So some ways of following them. All right. And keep in mind that, you know, if you start with a cohort of a thousand people and you're following them for ten years you don't want to just follow, you don't want to lose 90 percent of them. You don't want to end up with 100 people. Because if you go from a thousand people at the start and then at the end you can only find a hundred, you've lost 90 percent. If you get a result, you don't know what that result really means. What does it mean to all these thousand people? It only represents these hundred. And is what happened in these hundred the same as what happened in the thousand? So you won't necessarily know that. So you try to make sure your follow-up rates are good. If you start with a thousand you'd really like to get a thousand at the end. Make sure you know what happened to those thousand. Now that rarely happens. You're always going to lose people. But you try to keep that follow-up rate up over a reasonable level. Usually people like 70 or 80 percent, something like that. Depending on how long your cohort goes.
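The follow-up arithmetic above can be sketched in Python. The cohort of a thousand with only a hundred found matches the lecture's example; the case counts and the second scenario are invented, and the function names are hypothetical. The bounds function shows one simple way to see why heavy loss to follow-up makes a result hard to interpret: it assumes, at one extreme, that nobody who was lost got the disease and, at the other, that everybody did.

```python
def follow_up_rate(enrolled, traced):
    """Fraction of the original cohort whose outcome is known at the end."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    if not 0 <= traced <= enrolled:
        raise ValueError("traced must be between 0 and enrolled")
    return traced / enrolled


def risk_bounds(cases_found, traced, enrolled):
    """Lowest and highest possible risk in the full cohort.

    Lowest assumes no lost subject became a case; highest assumes all did.
    """
    lost = enrolled - traced
    lowest = cases_found / enrolled
    highest = (cases_found + lost) / enrolled
    return lowest, highest


# Lecture's scenario: start with 1000, find only 100 (10 cases is hypothetical).
poor_rate = follow_up_rate(enrolled=1000, traced=100)
poor_bounds = risk_bounds(cases_found=10, traced=100, enrolled=1000)

# Hypothetical better scenario: 900 of 1000 found, 90 cases among them.
good_rate = follow_up_rate(enrolled=1000, traced=900)
good_bounds = risk_bounds(cases_found=90, traced=900, enrolled=1000)

print(f"poor follow-up: {poor_rate:.0%}, risk between {poor_bounds[0]:.2f} and {poor_bounds[1]:.2f}")
print(f"good follow-up: {good_rate:.0%}, risk between {good_bounds[0]:.2f} and {good_bounds[1]:.2f}")
```

With only 10 percent of the cohort found, the true risk could be almost anything. With 90 percent found, the range of possible answers is much narrower, which is the reason for trying to keep follow-up high.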
I think the Nurses' Health Study, if I remember right, usually does a pretty good job. They can usually follow 90 percent of the women. Okay. And this is just another example of all the work that this particular study went through, and you can look at this later when we put it on bSpace, all these little things. All these different things they had to do to follow people. It's a lot of work. A lot of work. Okay. So you're going to talk about several different study designs. I guess I'm talking about cross-sectional studies on Wednesday. And you'll be talking about case-control studies and maybe a few others, ecological studies, those sorts of things. So these are the advantages of both kinds of cohort studies, prospective and retrospective, compared to some of the other study designs. The first thing is they're good for rare exposures. You'll see that some of the other designs you talk about are not necessarily good for rare exposures. Cohort studies are. Why? Because when you figure out what cohort you're going to study, you can go to a group of people that have the exposure. If you're interested in benzene, you don't have to go to everybody that's in Oakland, California. You can go to that benzene factory and establish your cohort in that benzene factory. So you can go to a place that has the exposure. So if your exposure is rare, you go to the place where it is. So that's a good thing about cohort studies. You'll see in some of the other designs it's not so good. The other good thing about cohort studies, and I'll show you this in the next lecture on cross-sectional studies, is that in cohort studies you can establish a temporal relationship. What that means is, does the exposure come before the disease? And again, in cross-sectional studies I'll show you how that can be complicated sometimes. And the other thing about cohort studies is you can look at a bunch of different outcomes. You have your exposed and unexposed. You're going to follow them over a period of time. 
You don't just have to look at lung cancer or ovarian cancer. You can look at both. Or diabetes or anything you want to. Any outcomes you want to. You can look at multiple outcomes, as opposed to case-control studies where, as you'll see, it's really just one outcome. You're just getting your cases of a particular disease. Okay. And you can determine disease rates. You get person-time data. That wasn't on Art's slides so I put it on mine. All right. But they're not good for rare diseases. Cohort studies are not good for rare diseases. The reason is, let's say you have a rare disease. One in 10,000. Say you have a disease where, every ten years, one in 10,000 people gets it. If you have a cohort of 10,000 people, you're going to spend all that time following them for ten years and you only get 1 or 2 cases. And I think you can see any study with only 1 or 2 people with the disease of interest isn't all that helpful. Okay. So for a rare disease you have to have a huge cohort in order to get enough cases. So it's not good for very rare diseases. And they're not good for diseases with long latency periods. Like arsenic. If you're exposed to arsenic you don't get disease the next day. You get disease 20 years later. So if you're doing a prospective cohort study on arsenic, you actually have to follow a cohort for 20 years. So that can be very expensive, as you can imagine. Very expensive. Okay. Now, that was cohort studies in general. Retrospective cohort studies are usually cheaper than prospective. Right? Because you're standing here and everything happened in the past. You don't have to follow people for 10 or 20 years into the future. You don't have to spend all that time and have all those researchers doing all that work. Everything happened in the past. So you just go back and get all those old records. 
That's not nearly as much work as checking every year to see whether somebody has the disease, following people for 20 years. That can get very expensive. But the problem with retrospective cohort studies, since again you're basing everything on past records, is the availability of those records. For most ideas and study designs and hypotheses these records aren't available. So that's why you won't see a whole lot of retrospective cohort studies. Because the records just aren't there. That's the major disadvantage. Okay. And maybe you're worried about smoking as a confounder. You may not know whether people smoked in the past or not. Those records may not be available. Okay. That's it. Any questions about retrospective cohort studies? Yes.
>>>: (Inaudible).
Craig Steinmaus: No, it's essentially the same. Yeah. It's essentially the same. It all depends on whether you are really following people forward over time or just pretending to. But other than that, the rate ratio is still your usual outcome metric, in both. Any other?
>>>: How does this generalize to the rest of the population (Inaudible) what are the odds (Inaudible).
Craig Steinmaus: Well, really what you're doing, like in that benzene example I gave you, is seeing not whether you or I have to worry about benzene. You're seeing whether those people that work in benzene factories should still be exposed to those high levels of benzene, or whether we should set regulations. So the results are generalizable to other people that are working in benzene factories. So, yeah. Okay? Any other questions? Okay. Free to go (laughter). Cross-sectional studies on Wednesday. (Applause) Thank you.
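Two pieces of arithmetic from the lecture, the expected number of cases for a rare disease and the rate ratio computed from person-time data, can be sketched in Python as follows. This is a rough illustration only: the function names are made up for this sketch, and the case counts and person-years in the rate ratio example are hypothetical numbers, not data from any study mentioned in the lecture.

```python
def expected_cases(cohort_size, one_in_n):
    """Expected cases over one follow-up period when the disease
    strikes 1 in `one_in_n` people per period."""
    if one_in_n <= 0:
        raise ValueError("one_in_n must be positive")
    return cohort_size / one_in_n


def rate_ratio(exposed_cases, exposed_py, unexposed_cases, unexposed_py):
    """Rate ratio = (exposed cases / exposed person-years)
                  / (unexposed cases / unexposed person-years).

    Computed by cross-multiplication, which is algebraically the same.
    """
    if exposed_py <= 0 or unexposed_py <= 0:
        raise ValueError("person-years must be positive")
    if unexposed_cases <= 0:
        raise ValueError("rate ratio is undefined with no unexposed cases")
    return (exposed_cases * unexposed_py) / (unexposed_cases * exposed_py)


if __name__ == "__main__":
    # Lecture example: a 1-in-10,000 disease in a cohort of 10,000 people.
    print("expected cases:", expected_cases(10_000, 10_000))

    # Hypothetical counts, for illustration only.
    print("rate ratio:", rate_ratio(30, 10_000, 10, 10_000))
```

The same rate ratio calculation applies whether the person-time was collected prospectively or reconstructed from past records, which is the point of the first answer above.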