Lecture 24: Intelligence

People aren't stupid, but even casual observation suggests that
some people are smarter than others. The late Richard Herrnstein, longtime professor of
psychology at Harvard, called intelligence testing psychology's most telling contribution
to date. While that statement may be a little too extreme, it's very clear that the notion
of intelligence, and the development of tests to measure intelligence, has been a tremendous
growth industry in psychology, and it's had a tremendous amount of societal impact.
The story begins in the 19th century, with Sir Francis Galton, a British mathematician
who was a cousin of Charles Darwin's. Like Darwin, Galton was interested in evolution
and he was impressed by his observation that success tended to run in families -- a point
which he tried to document in his book Hereditary Genius, published in 1869, a decade
after Darwin's Origin of Species. For Galton, the implications were
clear. There were natural variations in intelligence, just like there were natural variations in
various physical characteristics. High intelligence was adaptive and, therefore, subject to the
pressures of natural selection -- just like the beaks on Darwin's finches. Galton assumed
that individual differences in intellectual ability were inherited somehow -- he didn't
know how, nobody knew anything about genes at this point in history -- and in that way
passed down through families. He attempted to measure these individual differences -- developing
a field that he called anthropometrics, or the measurement of man. And he and those
who followed him, notably Charles Spearman, developed new statistics, like the correlation coefficient, that permitted
them to look at the relation between individual differences on one trait and individual differences
on another. Galton's commitment to the idea that individual differences in intelligence
were inherited led him to found the eugenics movement in Great Britain, promoting selective
breeding among highly intelligent people -- and for that matter, the prohibition of breeding
among unintelligent people -- as a way of improving the intellectual stock of Great
Britain. Across the English Channel, another source
of interest in intelligence testing arose in France with the work of Alfred Binet. A
certain amount of schooling was mandatory for French children, and the French Minister
of Public Instruction wanted to make sure that children were not denied schooling simply
on the basis of somebody's impression that they couldn't profit from it. So he commissioned
Alfred Binet, who was by then a prominent psychologist in France, to devise a test that
could put this matter on an empirical basis -- a test that could distinguish children
of normal or superior intelligence, who could profit from schooling, from those of sub-normal
intelligence who couldn't profit -- or who perhaps needed special education. Binet worked
on this project with another French psychologist, Theodore Simon, and in 1905 produced the Binet-Simon
test of intelligence -- which was a forerunner of all the intelligence tests in use today.
In contrast to Galton, Binet and the French authorities didn't assume that intelligence
was inherited and passed down through families. They didn't actually care about this issue
too much. What they really cared about was that children should not be unfairly denied
the opportunity to profit from schooling. So the Binet-Simon test was intended to put
selection for education on a completely objective basis.
Galton actually set up an anthropometric laboratory at the South Kensington Museum in London in
the 1880s to collect empirical data on individual differences in various abilities. And he focused
on relatively simple physical and mental abilities such as reaction time, strength of hand grip,
acuity of vision and hearing. Binet and Simon were much more practical. Their test, which
they called a scale for measuring intelligence, was actually intended to be a work-sample
of the kinds of things that were actually required of young schoolchildren in the French
schools. The original test had some 30 items. More were added later, but some samples are
listed here: following a moving object with the eyes; finding a square of chocolate
wrapped in paper; telling how two common objects are different or similar; repeating a sentence;
making rhymes; repeating spoken digits (like the short-term memory test that we discussed
earlier); taking three nouns and putting them together so as to form a meaningful sentence;
or defining abstract terms. So whereas Galton's approach was more abstract, Binet and Simon's
test was specifically oriented towards the intellectual requirements of the French primary
school.
But how do you score such a test? This is supposed to be a test for measuring intelligence,
but how do you quantify performance? Originally, Binet and Simon simply arranged the items
of their test in increasing order of difficulty, based on pilot testing. The child would pass
some tests and then start to fail. Children who started to fail later in the testing sequence
were then deemed more intelligent than children who failed earlier.
That's okay, but where's the cut point? Where's the threshold that determines whether the
child is appropriate for regular schooling or perhaps needs special education? So a couple
of years later, Binet and Simon did something different. They grouped their items into clusters
according to the age-level of children who routinely passed them -- so very young children
might be able to follow a moving object with their eyes and they might be able to find
and eat a square of chocolate wrapped in paper, but they might not be able to repeat a sentence
fifteen words in length or tell how two common objects are different. Somewhat older children
might be able to tell how two common objects are similar or different, might be able to
make rhymes, but not be able to use three nouns in a sentence, or define abstract terms.
In this way, Binet and Simon developed norms for test performance for each age level, from
age three to age thirteen, and then they compared the child's performance to children in various
age groups. If the child passed tests that were also passed by a majority of 4- or 5-year-olds,
but failed tests that were passed by a majority of 6- or 7-year-olds, the child
would be given a mental age of 5. If the child passed tests that were also passed by a majority
of 9-year-olds, but not by a majority of 10-year-olds, then the child would be given a mental age
of 9. So if schooling started in France when children were 5 years old, a child with a
mental age of 5 was deemed ready for school. A child with a mental age less than 5 was
deemed not to be ready for school, and maybe in need of special educational services.
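To make the scoring logic concrete, here's a minimal sketch in Python. The item groupings are illustrative stand-ins, not the actual Binet-Simon norms, and the majority-pass criterion is simplified to passing every item at a level.

```python
# A minimal sketch of mental-age scoring. The item groupings below are
# illustrative stand-ins, not the actual Binet-Simon norms, and the
# majority-pass criterion is simplified to passing every item at a level.
norms = {
    3:  {"follow moving object", "find wrapped chocolate"},
    5:  {"repeat a sentence", "tell how two objects differ"},
    7:  {"make rhymes", "tell how two objects are similar"},
    9:  {"use three nouns in a sentence"},
    11: {"define abstract terms"},
}

def mental_age(items_passed):
    """Highest age level whose items the child passes in full."""
    age = min(norms)                          # floor of the scale
    for level in sorted(norms):
        if norms[level] <= items_passed:      # set containment: passed them all
            age = level
        else:
            break
    return age

child = {"follow moving object", "find wrapped chocolate",
         "repeat a sentence", "tell how two objects differ"}
print(mental_age(child))  # 5 -- passes the 3- and 5-year items, fails at 7
```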
The Binet-Simon test caught on like wildfire, and served as the basis for the development
of similar mental tests here in the United States. Very quickly, Lewis Terman, a professor
of psychology at Stanford University, translated the Binet-Simon scale into English, and published
it as the Stanford-Binet intelligence scale -- a test that, with modifications, is still
in use today. Terman also modified the use of mental age as the index of intelligence.
He divided the child's mental age by the child's chronological age and then multiplied that
by 100 to yield what he called the intelligence quotient, or IQ score. Thus, a child who was
5 years old and had a mental age of 5 had an IQ of 100. A child who was 5 years old
but had a mental age of 4 was given an IQ of 80. A 5-year-old with a mental age of
6 was given an IQ of 120.
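Here is Terman's ratio IQ as a few lines of Python; the function name is just for illustration, but the three cases match the ones in the lecture.

```python
# Terman's ratio IQ: mental age divided by chronological age, times 100.
def ratio_iq(mental_age, chronological_age):
    return mental_age / chronological_age * 100

print(ratio_iq(5, 5))  # 100.0 -- mental age equals chronological age
print(ratio_iq(4, 5))  # 80.0  -- lagging a year behind
print(ratio_iq(6, 5))  # 120.0 -- running a year ahead
```

Another psychologist, Robert Yerkes, adapted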
the Binet scales for use in personnel selection in the Army, in World War I, when people were
subject to the draft. And the intelligence tests were part of the criteria used to determine
who was eligible for service and who was not. There were two of these tests: the Army
Alpha test for recruits who were literate, who could read, and the Army Beta test for
recruits who were illiterate. There was a lot of illiteracy in the United States in
the early part of the 20th century. Whereas the Binet-Simon scale and the Stanford-Binet
scales were intended to be administered individually, the Army Alpha and Beta tests were designed
to be administered in a group setting, so as to make the testing process more economical.
A version of the Army Alpha test is still used by the American armed forces. It's called
the Armed Forces Qualification Test, and every new recruit takes this within a few days
of enlisting. Finally, David Wechsler, another American
psychologist, devised a variant on the Stanford-Binet intelligence scale. It's
called the Wechsler Adult Intelligence Scale; there's also a children's version called the
Wechsler Intelligence Scale for Children. And these remain the gold standard
for intelligence testing in the United States and indeed throughout the world.
The early version of the Binet-Simon scale had 30 items. The later version had almost
60. By contrast, the Wechsler Adult Intelligence Scale had 11 sub-tests, grouped
into verbal and performance scales. In the Information test, subjects were asked
questions about general information, like the height of the average American
woman. In the Comprehension test, subjects were asked questions like, what would you
do if you found a letter that was sealed, stamped, and addressed on the street? In the
Memory Span test, they were asked to repeat a series of up to eight digits forward and
a series of up to six digits backwards.
Arithmetical Reasoning asked the subject to perform simple arithmetic tasks of addition,
subtraction, multiplication, and division. In Similarities, the subjects had to say what
two objects had in common, like an orange and a banana. Then there was a Vocabulary
test in which, of course, the subjects were asked to define various words.
The performance scales didn't require so much by way of verbal ability. In the Picture Arrangement
test, subjects were presented with cartoon strips that had been cut into individual panels,
and they were asked to re-arrange the panels in order to make a coherent story. In the
Picture Completion test, they were shown a line drawing that had a part missing, and
they were asked to supply the missing part from a number of choices. In the Block Design
test they were asked to produce a particular design from a set of colored blocks. Object
Assembly was like a jigsaw puzzle. And in the Digit-Symbol test, the subjects were required
to associate symbols, like a star or a cross, with digits like 3 and 6.
Wechsler also introduced a new way of calculating IQ, known as the deviation IQ. Much as Binet
and Simon had done, Wechsler administered his new scale to individuals from different
age groups. Of course, Binet and Simon were interested in young children, so they tested
young children. But this is the Wechsler Adult Intelligence Scale, and so the subjects for
this testing were between 16 and 75 years of age. Wechsler calculated the mean score
for each age group on his test, as well as the standard deviation. He
then employed the z-score procedure to standardize scores against these means and standard deviations.
By comparing each subject's score to the mean and standard deviation
of his or her own age group, Wechsler could express IQ in terms of the deviation
of the subject's score from the mean of that age group.
Here's an example of how it worked. Suppose there's a test, and the average score on the
test is 40 with a standard deviation of 12, and what we want to do is to transform this
score so that it's got a mean of 100 and a standard deviation of 15. If the subject had
a test score of 40 -- well, that's exactly the mean and that would give us a deviation
IQ of 100. But if the subject had a test score of 28, 12 points below the mean, that's
1 standard deviation below the mean, so that subject would have a deviation IQ of 85.
A score of 52, 12 points above the mean, translates into a deviation IQ of 115. A score of
16, two standard deviations below the mean, would translate into an IQ of
70. A score of 64, two standard deviations above the mean would translate into an IQ
of 130. In this way, subjects are compared only to others of their own age group. But
everybody's put on a common scale. No matter how old you are, an IQ of 85 means that you
scored about a standard deviation below the mean of your age group. And an IQ of 115 means
you scored a standard deviation above. An IQ of 100, no matter how old you are, means
you have normal intelligence -- that is, normal for your age group.
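Here's that transformation as a few lines of Python, using the lecture's illustrative numbers: a raw mean of 40 and a raw standard deviation of 12.

```python
# The deviation IQ as a z-score rescaled to mean 100 and SD 15, using the
# lecture's illustrative numbers (raw mean 40, raw SD 12), not real norms.
def deviation_iq(raw_score, group_mean=40.0, group_sd=12.0):
    z = (raw_score - group_mean) / group_sd   # z-score within one's own age group
    return 100 + 15 * z

for raw in (40, 28, 52, 16, 64):
    print(raw, deviation_iq(raw))
# 40 -> 100.0, 28 -> 85.0, 52 -> 115.0, 16 -> 70.0, 64 -> 130.0
```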
Using this procedure, Wechsler obtained an essentially normal distribution of IQ scores
in the population. This is his actual data from 1939 based on the standardization of
the original version of the Wechsler Adult Intelligence Scale. And you can see that the
histogram, the frequency distribution of IQ in this sample, looks pretty good compared
to the ideal, normal distribution, with a so-called bell curve. The most notable departure
is at the low end, where there are a few more subjects with very low IQ's than you'd expect
on the basis of the normal distribution. To explain this, Wechsler offered a two-factor
theory of mental retardation which is still very popular. To some extent, low IQ is
going to reflect just the normal distribution. Some people are going to be very high, some
people are going to be very low; most people are going to fall in-between. But Wechsler
also argued that there were some accidents of pre-natal development, for example, that
might produce brain damage, and therefore add to the numbers of people with relatively
low IQs. This normal or bell-shaped distribution of
IQ scores is exactly what Galton expected back in the 19th century on theoretical grounds.
He assumed that mental traits, like intelligence, were distributed normally in the population
just like physical traits are, such as height or shoe size. But you should understand that
the appearance of the normal curve is completely a product of the way the scores were transformed.
Here's an example based on the work of the sociologist Claude Fischer and his colleagues,
who examined the performance of young American adults on a version of the IQ test known as
the Armed Forces Qualification Test, which is basically a variant on the Army Alpha test
I discussed earlier. This test was given to a sample of American young people in 1980.
The shaded portion of the figure shows the distribution of raw scores, actual scores
on the AFQT. And as you can see, it's skewed, with most of the scores bunching up towards
the high end of the scale. Now assume, for the purposes of argument, that the average
raw score is 80 -- it doesn't matter what it is, but assume it's 80. And now suppose
we want to express each subject's score on a new scale -- we arbitrarily set the mean
to 50 and the standard deviation to, say, 20. That subject, who used to have a score of 80,
is now going to have a score of 50, and all the other subjects' scores get re-expressed
as well. In practice, the test-makers go a step further: rather than simply rescaling, they
convert each raw score to its percentile rank and map that rank onto the corresponding point
of a normal curve, forcing the scores into a bell shape. This statistical transformation produces
a nice normal distribution centered on some mean. But the shape of the distribution is
somewhat misleading. The bell curve, produced by the statistical transformation, gives us
a very different sense of the distribution of abilities than did the original distribution.
You'd never know from the shape of the bell curve that so many students did so well on
the test. Now there's nothing wrong with this, and there are actually good reasons for doing
transformations of this sort. They make the numbers much easier to handle statistically.
But the bell-shaped distribution of IQ is not empirical evidence favoring Galton's assumption
that mental abilities are normally distributed in the population. The bell-shaped curve is,
in some sense, an artifact of the way the data has been handled.
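Here's a sketch of the kind of rank-based "forced normal" transformation involved. The skewed raw scores are simulated, not actual AFQT data; the percentile-to-normal mapping is one standard way of forcing a distribution into a bell shape.

```python
# Rank-based normalization: convert each raw score to a percentile rank, then
# map the rank onto the matching point of a normal curve (mean 100, SD 15).
# The skewed raw scores here are simulated, not actual AFQT data.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
raw = 100 - rng.gamma(shape=2.0, scale=8.0, size=10_000)  # bunched at the high end

ranks = raw.argsort().argsort()           # rank of each score, 0 .. n-1
percentiles = (ranks + 0.5) / raw.size    # avoid hitting 0 or 1 exactly
iq = norm.ppf(percentiles, loc=100, scale=15)

print(round(iq.mean()), round(iq.std()))  # ~100 and ~15, bell-shaped by construction
```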
Now as I say there's nothing wrong with this kind of maneuver, it's perfectly appropriate,
so long as everybody is clear about what's going on. Some students in this course will
be familiar with the forced curve sometimes used by college instructors so that the average
course grade is set at something like a C or maybe C+ and then the other grades are
arrayed around that so that most people will get some kind of C; some fewer people will
get Bs; even fewer people will get As. In such a system, even if everybody did really
well on a test, they wouldn't all get As. Some would have to get Cs because they were
only at the mean of the distribution. By the way, we don't do that in this course.
Here there is an absolute standard for an "A". If you achieve 90% of the available points,
you are going to get some kind of "A", that's a guarantee, and if everybody in the class
gets some kind of "A" that's fine with me. Other instructors take different positions
and the forced curve is very popular in the natural sciences and mathematics. It's also
very popular in law school and business school so watch out when you get there.
The various Wechsler intelligence scales, and now the Stanford-Binet as well, all use
deviation IQ, so that scores are centered on a mean of 100 with a standard deviation
of 15. Many of the college and graduate school entrance exams administered by the College
Board also use a forced normal distribution, centered on a mean of 500, and a standard
deviation of 100. If you get a score of 500 on your SAT, it means you got an average score.
If you got a score of 600, it means your score is one standard deviation above the mean.
The law school admissions test, also administered by the college board, uses the same idea,
except it's centered on a mean of 150 and a standard deviation of 10. Again there's
nothing objectively wrong with this, as long as everybody's clear about what's going on.
And what people should be clear about is that, to some extent, the normal distribution of
IQ scores is in part an artifact of the way those scores have been calculated.
Intelligence tests, like all psychometric instruments must have certain psychometric
properties. Standardization refers to the set of rules for administering and scoring
the test that guarantee that each individual responds to the same test. There's a
right way and a wrong way to administer the Wechsler scale. Norms give us some sense of
the average test score in the population, as well as the variation observed around that
mean (usually expressed as the standard deviation), and serve as a guide for interpreting individual
test scores. Population norms served as the basis for Wechsler's deviation IQ. Reliability
has to do with the degree of precision in measurement. This is commonly expressed either
in terms of interrater reliability, the agreement between two examiners scoring the same test;
or test-retest reliability, which is the agreement between two tests of the same person given on two different occasions.
The higher the correlation, the higher the reliability. Validity refers to the idea that
the test actually measures what it is supposed to measure and is usually determined by the
ability of the test score to predict some external criterion of the trait. For example,
intelligence tests might be validated against a criterion of educational outcome, such as
grades completed or grade-point average. In addition, one psychometric property is
highly desirable, even if it is not strictly necessary. This is utility, or the idea that
the test provides an economic advantage over alternative measures of the same trait.
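As a concrete illustration, test-retest reliability is just a correlation between two administrations of the same test. The scores below are made up for ten hypothetical subjects.

```python
# Test-retest reliability as a simple correlation: the same ten subjects
# tested on two occasions. The scores are invented for illustration.
import numpy as np

first  = np.array([98, 112, 105, 120, 87, 101, 133, 94, 110, 125])
second = np.array([101, 108, 107, 118, 90, 99, 130, 97, 113, 121])

reliability = np.corrcoef(first, second)[0, 1]
print(round(reliability, 2))  # close to 1.0: high test-retest reliability
```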
Now so far, we've been talking about intelligence as if it were a single unitary mental ability
represented by a single score, the IQ score. In fact, the Wechsler scales calculate
two different IQs, a verbal IQ and a performance IQ, which are then aggregated to produce
a total IQ score. Still, Charles Spearman, a British psychologist, argued vigorously
for the idea that intelligence was a single unitary mental entity. Spearman was a statistician
as well as a psychologist, and, building on Sir Francis Galton's work, he developed his own
version of the correlation coefficient. Examining the correlations among a host of different mental tests, he determined
that they were all positive and relatively high. Because scores on these tests were all
highly inter-correlated with each other, he argued that there was a single ability
that ran through all of them -- that performance on each individual test was saturated with
a single ability that he called general intelligence or g.
Spearman's theory was actually a two-factor theory. He argued that g was the most important
determinant of test performance, but that there were also factors that were specific
to each individual test. No matter how smart you are, if you're
clumsy with your hands, you're going to do very poorly on Block Design and Object Assembly.
Spearman's theory is called a two-factor theory because he argued that performance in any
particular test was determined by some combination of g and some test-specific s, but he really
thought that intelligence was just one big thing.
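A toy simulation makes Spearman's argument concrete. Assuming each test score mixes a single general factor g with a test-specific factor s (the loadings below are arbitrary illustrative choices), every pair of tests comes out positively correlated.

```python
# A toy simulation of Spearman's two-factor theory: every test score mixes a
# single general factor g with a test-specific factor s. The loadings are
# arbitrary illustrative choices, not estimates from real data.
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
g = rng.standard_normal(n)            # one general ability per person
loadings = [0.8, 0.7, 0.6, 0.75]      # how g-saturated each test is

scores = np.column_stack([
    w * g + np.sqrt(1 - w**2) * rng.standard_normal(n)  # g plus specific s
    for w in loadings
])

# Every off-diagonal entry comes out positive -- the "positive manifold"
# that led Spearman to posit a single general intelligence.
print(np.corrcoef(scores.T).round(2))
```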
By contrast, L.L. Thurstone, an American psychologist, argued that there were different kinds of tests,
different kinds of mental abilities. He looked at the same kind of correlational data that Spearman had, but
he used a more advanced statistical technique, known as factor analysis, that wasn't available
to Spearman at the time. We don't have to go into the details, but factor analysis essentially
groups items together or tests together based on the magnitude of their intercorrelations.
Tests that are highly inter-correlated cluster together to form a factor. Thurstone observed
that some test inter-correlations were much higher than others, and he showed through
factor analysis that the various kinds of mental tests clustered together in terms of
what he called seven primary mental abilities, having to do with facility with numbers, word
fluency, knowledge of word meanings, memory, ability to reason, ability to handle spatial
relations, and also perceptual speed, like reaction time.
Thurstone agreed with Spearman that all the correlations were positive, so that there was something
like general intelligence. But he argued that the general factor of intelligence was relatively
weak compared to the specific factors representing these primary mental abilities.
The conclusion for Thurstone, then, was that there are really seven different kinds of intelligence.
They are linked, but the links are relatively weak.
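Here's a sketch of that clustering logic using simulated data: two group factors (verbal, spatial) plus a weaker general factor. This illustrates Thurstone's reasoning rather than his actual test battery.

```python
# A toy illustration of factor analysis in Thurstone's spirit: tests that
# correlate highly cluster onto the same factor. Data are simulated from two
# group factors plus a weak general factor -- not Thurstone's actual battery.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)
n = 5_000
general = rng.standard_normal(n)
verbal = rng.standard_normal(n)
spatial = rng.standard_normal(n)

def test(group):  # each test = weak general + strong group factor + noise
    return 0.3 * general + 0.7 * group + 0.4 * rng.standard_normal(n)

battery = np.column_stack([test(verbal), test(verbal),     # two "verbal" tests
                           test(spatial), test(spatial)])  # two "spatial" tests

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(battery)
print(fa.components_.round(2))  # verbal tests load on one factor, spatial on the other
```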
An even more differentiated view of intelligence was offered by J.P. Guilford, another American
psychologist. Spearman had argued that there was just one kind of intelligence, Thurstone
that there were seven primary mental abilities. Guilford argued that there was a whole host
of mental abilities, a very large number of them, which could be classified according
to a three-dimensional scheme. This was known as the structure of intellect.
The first dimension was content. There were four broad areas of information to which people
applied their intellectual abilities. First there was figural content, acquired through
seeing and other sensory faculties; then there was symbolic information such as the numbers
and the letters of the alphabet; there was semantic information, acquired in terms of
words or sentences; and then there was behavioral information, acquired from the actions of
other people. The second dimension consisted of operations,
or various kinds of intellectual processes. There was evaluation, the ability to judge
whether information is accurate or valid. Convergent production was the ability to come
up with a single solution to a problem as expressed in rule following or problem solving
behavior. Divergent production was the ability to generate multiple solutions to the same
problem. Memory, obviously, had to do with the ability to encode and retrieve information.
And cognition had to do with the ability to understand and comprehend information.
These operations were applied to contents to yield products, the outcome
of some kind of intellectual processing. These products consisted of units or single items
of knowledge; classes, which were groups of units that shared common attributes, like
concepts and categories; relations, that linked units as opposites or associations or sequences
or analogies or metaphors; systems, comprising networks of multiple relations; transformations,
in which one kind of knowledge was converted into another kind of knowledge; and implications,
the ability to make predictions or inferences, or to anticipate the consequences of some
item of knowledge. There were four types of content, five possible
operations, and six products, which together yielded 120 different abilities. So for Guilford,
intelligence is not one thing, it's not even seven things. It's 120 things and that wasn't
the end of the theory. I've just presented the classic version of Guilford's Structure of
Intellect Theory, which entails 120 different kinds of intelligence. But Guilford lived for another
twenty years. He kept refining his theory and before he died he had gotten up to 180
different intellectual abilities or different kinds of intelligence. Each ability represented
a combination of a particular operation applied to a particular content area to generate a
particular product.
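The combinatorics are easy to check with a few lines of Python; the category names below follow the lecture, and the enumeration is just Guilford's classic 4 x 5 x 6 scheme.

```python
# Guilford's classic model: 4 contents x 5 operations x 6 products
# = 120 distinct abilities. Category names follow the lecture.
from itertools import product

contents = ["figural", "symbolic", "semantic", "behavioral"]
operations = ["evaluation", "convergent production", "divergent production",
              "memory", "cognition"]
products = ["units", "classes", "relations", "systems",
            "transformations", "implications"]

abilities = list(product(operations, contents, products))
print(len(abilities))   # 120
print(abilities[0])     # e.g. ('evaluation', 'figural', 'units')
```

It's all a little overwhelming, but Guilford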
devoted his entire career to developing tests to measure each of these various abilities.
Perhaps the most important contribution of Guilford's theory, however, is his distinction
between convergent thinking and divergent thinking. We usually think of intelligence
in terms of a person's ability to get the answer to a problem, to get it quickly, and
to get it right -- implying that there is just one answer, and it's either right or
wrong. But the essence of creativity is to come up with novel solutions to problems.
Guilford argued that divergent thinking, creative thinking, is at least as important
as convergent thinking, getting the one right answer. And he stimulated a new line of research
on the nature of creative thinking and creative problem solving.
The last of the classic distinctions among types of intelligence that we want to talk
about is the distinction between crystallized and fluid intelligence proposed by Raymond
B. Cattell, another American psychometrician. Cattell essentially accepted Spearman's argument
that there was just one kind of intelligence, g, that had to do with the individual's ability
to perceive relationships in general, relationships of all sorts, between any kinds of things.
And he assumed that this ability to perceive relationships, to make connections among things,
if you will, had a clear neurological base in brain structure, in the brain's ability
to make connections. Fluid intelligence, then, is part of the individual's biological endowment.
It's essentially a characteristic of his or her nervous system, of the brain. By contrast,
crystallized intelligence has to do with a person's individual experiences: cultural,
educational, and environmental. Fluid intelligence is a product of the brain. Crystallized intelligence
is a product of experience and it varies with the kinds of educational experiences and other
environmental encounters the person has. Let's use a culinary analogy: fluid intelligence is
like bread dough, crystallized intelligence is what happens when that bread dough is put
into a particular mold and then baked. It's crystallized. It takes on a particular shape,
as a result of experiences in the environment. Cattell assumed that performance on any task
was basically a product of three different kinds of components: the person's capacity
for fluid intelligence, plus education and motivation. Crystallized intelligence, in Cattell's view, was
the kind of thing that was assessed by standard intelligence tests. After all, these tests
have things like vocabulary subtests and vocabulary is something you acquire with education.
Cattell argued that in order to get the proper assessment of fluid intelligence we needed
to develop what he called culture-fair tests that would assess a person's ability -- general,
content-free ability to perceive relationships, but wouldn't be distorted by the person's
particular cultural, or educational, or social experiences.
As an example of what a culture-fair test of intelligence might look like, consider
Raven's Progressive Matrices Test, introduced in England around the same time as Wechsler
was introducing the WAIS. In Raven's Progressive Matrices the subject is presented with a set
of figures that illustrate some relationship. And then he has to complete another set expressing
exactly the same relationship. So given this sequence of three squares and three circles,
what would the third diamond look like? Raven argued, and I think plausibly,
that performance on a test like this is almost completely free of contamination by book-learning.
It looks very much like an almost pure test of fluid intelligence.
The theories offered by Spearman, Thurstone, Guilford, and Cattell are the classic theories
of the structure of intelligence and they remain very influential. A number of modern
theorists still embrace Spearman's idea that there's only one kind of intelligence, g.
But other contemporary theorists have promoted the idea that intelligence comes in many different
forms. Among these more recent theories is the theory of multiple intelligences offered by
Howard Gardner. Gardner suggested that there were seven and perhaps more quite different
kinds of intelligence, each hypothetically dissociable from the others -- so that somebody
could be high on one or two of these kinds of intelligence, but normal or even
low on some of the others. There's no general factor of intelligence in Gardner's
view. Some of the forms of intelligence postulated by Gardner, such as linguistic, logical-mathematical,
and spatial, are the kinds of things that are assessed by standard intelligence tests.
Other forms of intelligence are quite novel. So, for example, Gardner has proposed that
there is a bodily-kinesthetic form of intelligence that is characteristic of skilled athletes
or skilled dancers. Musical intelligence is the kind of skill that's possessed by a prodigy
like Mozart or, for that matter, Midori or Michael Jackson. Intrapersonal intelligence
has to do with the person's ability to gain access to his or her own internal emotional
life -- to know what he or she wants and believes and thinks and feels. And interpersonal intelligence
has to do with the individual's ability to notice and make distinctions among other individuals.
Although Gardner's multiple intelligences are individual-difference constructs, in which
some people or some groups are assumed to have more of these abilities than others,
Gardner does not typically rely on the traditional psychometric procedures -- scale construction,
factor analysis, and the like, for documenting individual differences. Instead, he tends
to rely on two kinds of evidence. One is isolation by brain damage, in which one form of
intelligence is selectively impaired while other forms remain relatively unimpaired. The other
is exceptional cases: individuals who possess extraordinary levels of ability in one domain
against a background of normal or even impaired abilities in some other domain. For example, Gardner
offers Sigmund Freud and Marcel Proust, the French writer, as prodigies
in the domain of intrapersonal intelligence because they seem to have had extraordinary
access to their own interior mental lives -- the unconscious for Freud and memory for
Proust. And he offers Mahatma Gandhi and Lyndon
Johnson, the American President, as their counterparts in the domain of interpersonal
intelligence -- because both of these individuals displayed high levels of interpersonal and social intelligence, an ability to get along
with people, and manipulate people, and so on, against the background of more normal
abilities in other domains. Gardner postulates other kinds of signs that
suggest different types of intelligence. For example, there are identifiable core operations, coupled
with experimental tasks that permit analyses of these core operations, and psychometric
tests that reveal individual differences in the ability to perform them. But when push
comes to shove, Gardner's really interested mostly in isolation by brain damage and in
what we can learn from exceptional individuals.
And then there's the triarchic theory of intelligence proposed by Robert Sternberg, another American
psychologist. In contrast to many of the other theorists that we've discussed, Sternberg
comes out of the tradition of experimental cognitive psychology, as opposed to psychometrics.
As its name implies, Sternberg argues that intelligence has three quite different facets
to it: an analytical facet, a creative facet, and a practical facet. Analytical intelligence
most closely resembles the standard psychometric definition of intelligence as the ability
to see relationships and solve problems. It's very closely related to academic problem solving
skills. Analytical intelligence includes a set of meta-components, or executive functions,
that organize performance and handle learning processes. Then there are performance components:
the basic operations involved in any particular cognitive activity -- encoding
information, storing it, retrieving it, making calculations, making comparisons and all that
business. And then there are knowledge-acquisition components which really have to do with the
ability to learn. But Sternberg expands the traditional psychometric
view of intelligence with his idea of creative intelligence, which involves the ability to
have insights, to synthesize different kinds of information, and to react to novel situations.
In his emphasis on creative intelligence, Sternberg picks up a thread from J.P. Guilford's
notion of divergent as opposed to convergent thinking.
The third facet of intelligence in Sternberg's theory is practical intelligence, which has
less to do with academic problem-solving skills, and more to do with the ability to understand
and deal with the problems that we confront in the ordinary course of everyday living.
Practical intelligence is the intelligence that allows us to get along in the real world.
It has to do with our ability to adapt to the environment that we find ourselves in
-- to change the environment in order to better meet our needs, and to move from one environment
to another so that our goals can be met.
We usually think of intelligence as a cognitive ability having to do with the person's ability
to perceive and remember and reason and solve problems. But E.L. Thorndike postulated the
existence of a specifically social domain of intelligence, having to do with one's ability
to understand and manage other people. His hypothesis was that social intelligence was
distinct from academic intelligence -- an idea that remains controversial to this day.
Similarly, Peter Salovey and John Mayer have argued for a concept of emotional intelligence,
distinct from cognitive intelligence. Emotional intelligence has to do with individual differences
in abilities related to the emotional domain of living. Salovey and Mayer defined emotional
intelligence as the ability to monitor one's own feelings and those of others; to discriminate
among these feelings; and to use information about feelings to guide one's thoughts and
action. Again, although the idea is attractive, it's still not entirely clear that emotional
intelligence is anything more than a specific application of general analytical intelligence.
So there are wide individual differences in intelligence, though, frankly, the practical import of these differences, at least within the normal range of ability, has been a matter of some debate. One thing that's very clear, however, is that for all these individual differences, on the
whole we're getting smarter. Credit for this discovery goes to James Flynn, a professor of political
science at the University of Otago in New Zealand, and his finding is now known as the
Flynn effect. Recall from our earlier discussion of the bell curve that, by convention, the
distribution of IQ scores is typically standardized to a mean of 100. When new revisions of these
tests have been published, new norms are also collected and these, too, are standardized
to a mean of 100. So, by definition, the average IQ in the population remains constant, at
100. But just as the practice of normalizing obscured the true shape of the distribution
of the intelligence test scores in the population, this practice obscures any changes
in actual test performance that might have occurred over time. Think about inflation
in economics. A dollar bill is a dollar bill, but a dollar in 2009 buys what 11 cents would
have bought in 1948, the year I was born. Put another way, a dollar in 1948 would have
the buying power of about nine dollars in 2009. To see if there was any inflation or
deflation in I.Q. scores, Flynn examined normative studies in which subjects completed both the
old and the new form of a test, such as the Stanford-Binet or the WAIS. Then he compared
the performance of the new normative group against the old norms. Almost without exception,
the result was that the new normative group scored higher on the old test than the original
normative group had done, even though the groups were selected according to the same
criteria. For example, between 1932 and 1978, white
American subjects gained almost 14 IQ points, a rise of almost a whole standard deviation,
at a fairly steady rate of approximately 1/3 of an IQ point per year. This population
increase in IQ test scores was obscured by the practice of standardizing each new edition
of the tests to the artificial mean of 100. The same point can be made by looking at changes
in raw scores on a test such as Raven's Progressive Matrices, which are not normalized in the
same manner as the Stanford-Binet and the Wechsler. In the Netherlands, the Dutch military
routinely examines all 18-year-old men, and the examination includes a relatively difficult
version of Raven's Progressive Matrices. The proportion of Dutch men who scored more
than 24 of the 40 items correct increased from 31% in 1952 to 82% in 1982, which translates
into an IQ gain of more than 21 points. Similar gains have been observed wherever
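The conversion from pass rates to IQ points is easy to reconstruct, assuming normally distributed scores (SD of 15 IQ points) and a 24-item threshold of fixed difficulty.

```python
# Converting pass rates into an IQ gain, assuming normally distributed
# scores (SD = 15 IQ points) and a threshold of fixed difficulty.
from scipy.stats import norm

z_1952 = norm.ppf(1 - 0.31)    # threshold sat ~0.50 SD above the 1952 mean
z_1982 = norm.ppf(1 - 0.82)    # and ~0.92 SD below the 1982 mean

gain = (z_1952 - z_1982) * 15
print(round(gain, 1))          # ~21.2 IQ points, matching the lecture's figure
```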
relevant data has been collected. So if, as the saying goes, intelligence is what intelligence
tests measure, the Flynn effect shows that we are indeed getting smarter year by year.
The source of this effect is obviously environmental. There's been no widespread eugenics program
in any of these societies, such as was envisioned by Sir Francis Galton. It's been suggested
that improved nutrition is responsible for the increase, at least in part. After all,
people have gotten significantly taller over the same period. Generational changes in education
and socioeconomic status have also been implicated. And it's also possible that the increase is
due simply to modernization itself. Recent generations have been raised in an environment that
is more complex, more stimulating, more informationally rich, more intellectually demanding, than
was the case even a couple of generations ago. These environmental changes themselves,
the product of human intelligence, have had a reciprocal effect of increasing the intelligence
that caused them.