While Goddard introduced Binet's test to the United States, it was Lewis Terman who ensured its lasting popularity. At the same time that Binet and Simon were developing the first version of their scale in France, Terman was working on his doctoral thesis at Clark University in Massachusetts. A former teacher, Terman had noted that some students seemed to sail through all of their classes, while other students always struggled. He wanted to find mental tests that would distinguish one group of students from the other. To do this, he gave a series of tests to 14 schoolboys—seven of whom had been singled out by their teachers as exceptionally bright, and seven of whom had been singled out as exceptionally dull. Although Terman was still unaware of Binet's work, the tests he chose were more similar to those of Binet than to those of Galton or Cattell. The tests involved creative imagination, logic, mathematical ability, language mastery, interpretation of fables, the game of chess, memory, and motor skill.
As Terman had expected, the bright boys did better, on average, than the dull boys on all the tests except those for motor skill. There was some overlap, however. On most of the tests, the best of the "dull" boys outdid the worst of the "bright" boys. As a result, Terman was disappointed by his findings. Yet the results seemed like a failure only because Terman had downplayed a key factor: The dull boys were almost a full year older, on average, than the bright ones. Had the two groups been the same age, the differences in their performance would have been greater. At the time, however, Binet had not yet pointed out the critical need for age standards in intelligence testing, and Terman had failed to appreciate just how important age was.
In 1910, Terman accepted a teaching position at Stanford University. Around this time, he also learned about the Binet-Simon Scale. He immediately saw the advantage of using age standards. When age was taken into account, both his test items and those on the Binet-Simon Scale did a relatively good job of predicting school success. However, Terman also saw that the Binet-Simon Scale needed to be adapted for a U.S. audience. Terman showed that, in its original form, the Binet test seriously overestimated intelligence in young American children, but underestimated it in older children. Clearly, some of the test items and scoring needed to be adjusted.
Terman set out to assess Binet's test items on a large number of American children. Several new items, some of which were based on Terman's doctoral research, were assessed as well. Since Terman used better methods for choosing children on whom to try out the test, his results were more accurate than those of Binet. In 1916, Terman published his Stanford Revision and Extension of the Binet-Simon Scale, an unwieldy name that was quickly shortened to Stanford-Binet. The new test was more than a mere translation of the Binet-Simon Scale, however—it was a big leap forward. Forty new test items had been added, and some of the less reliable original items had been dropped. In addition, Terman had borrowed Stern's idea of expressing results on the test as an IQ score.
The Stanford-Binet was an advance in other ways as well. For example, it was the first published intelligence test to include very specific, detailed instructions for administering and scoring the test. It also offered alternate items to be used under certain circumstances, such as when the examiner made a mistake in giving the regular item.
The Stanford-Binet quickly came to be regarded as the leading intelligence test and the gold standard by which future tests would be judged. It included six tasks at each age level. Following are two examples.
• Age four: Saying which of two horizontal lines is longer; matching shapes; counting four pennies; copying a square; repeating a string of four numbers; answering a question such as: "What must you do when you are sleepy?"
• Age nine: Knowing the current day of the week and year; arranging five weights from heaviest to lightest; doing mental arithmetic; repeating a string of four numbers backward; producing a sentence using three specified words; finding rhymes.
In 1926, Terman began working on a revision of the test with his colleague Maude Merrill. The project took them 11 years to complete. The 1937 revision offered two equivalent forms of the test. It also added new types of tasks for preschool and adult test takers.
Another revision of the test was already well under way at the time of Terman's death in 1956. Published in 1960, this third edition of the Stanford-Binet offered only one form of the test, composed of the best items from the two earlier forms. No new items were added. There was one big change, however: the introduction of a new way of calculating IQ. No longer was it simply a matter of dividing mental age by chronological age, then multiplying by 100. Instead, a deviation IQ was used. The deviation IQ was based on comparing an individual's performance with that of a group of same-aged people who were tested during the test's development phase. Test performance was converted to a score on a scale where the average was always 100 and the standard deviation, a measure of how widely the scores are spread around the average, was 16. In the current version of the Stanford-Binet, the standard deviation is 15, but the average is still 100.
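The difference between the two ways of calculating IQ can be made concrete with a little arithmetic. The following is only a minimal sketch: the function names and the sample raw scores are hypothetical and chosen for illustration. It also assumes that a deviation IQ is obtained by measuring how many standard deviations a person's score lies from the same-age group's average, which is a simplification of how published tests actually convert scores.

```python
def ratio_iq(mental_age, chronological_age):
    """Older method: mental age divided by chronological age, times 100."""
    return 100 * mental_age / chronological_age


def deviation_iq(raw_score, group_mean, group_sd, iq_sd=16):
    """Deviation method (simplified): place a score relative to same-age peers.

    raw_score, group_mean, and group_sd are hypothetical test-score values.
    iq_sd is 16 for the 1960 edition and 15 for the current version.
    """
    # How many standard deviations above or below the group average
    z = (raw_score - group_mean) / group_sd
    # Rescale so the group average is always 100
    return 100 + iq_sd * z


# A child of 8 who performs like a typical 10-year-old
print(ratio_iq(10, 8))                      # 125.0

# A made-up raw score of 60 in an age group averaging 50 with a spread of 10
print(deviation_iq(60, 50, 10))             # 116.0
print(deviation_iq(60, 50, 10, iq_sd=15))   # 115.0
```

In this sketch, a person who scores exactly at the average for his or her age group always receives a deviation IQ of 100, whatever the age group.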
To understand how this works, it helps to picture the range of scores fitting neatly into a bell-shaped curve, with the average at the peak of the bell. About two-thirds of all scores fall within one standard deviation of the average. In other words, about two-thirds of all people have IQ scores between 85 and 115. About 95% of all scores fall within two standard deviations of the average. In other words, only about 5% of all people have IQ scores lower than 70 or higher than 130. This type of test, with an average of 100 and a standard deviation of 15, has become the industry standard in intelligence testing.
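These proportions can be checked with a short calculation. The sketch below assumes that scores follow an idealized bell (normal) curve exactly, with an average of 100 and a standard deviation of 15; real score distributions only approximate this. The function names are hypothetical.

```python
import math


def normal_cdf(x, mean=100.0, sd=15.0):
    """Share of an idealized normal distribution that falls at or below x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def share_between(low, high, mean=100.0, sd=15.0):
    """Share of scores falling between low and high."""
    return normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd)


# Within one standard deviation of the average (85 to 115)
print(round(share_between(85, 115), 3))      # 0.683

# Within two standard deviations of the average (70 to 130)
print(round(share_between(70, 130), 3))      # 0.954

# Lower than 70 or higher than 130
print(round(1 - share_between(70, 130), 3))  # 0.046
```

The results, roughly 68% and 95%, match the "about two-thirds" and "about 95%" figures, leaving close to 5% of scores below 70 or above 130.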