The Stanford-Binet Intelligence Scale is a standardized test that assesses intelligence and cognitive abilities. Intelligence is "a concept intended to explain why some people perform better than others on cognitive tasks." Intelligence is defined as "the mental abilities needed to select, adapt to, and shape environments." It involves the abilities to profit from experience, solve problems, reason, and successfully meet challenges and achievement goals. Intelligence tests began as a psychologist's solution to a problem faced by Paris schools at the beginning of the twentieth century. Alfred Binet, a French psychologist, developed a test to measure potential ability at school tasks rather than performance in school, and to produce the same scores regardless of the personalities or prejudices of those who gave or took the test. The scoring method originally used by Binet and his collaborator, Théodore Simon, was based on the concept of mental age, or MA (the chronological age typical of a given level of performance). For the average child, mental age and chronological age (CA) match. For example, an average child who is 10 years of age has a mental age of 10. But some children who are less intelligent than average will not be able to pass all the items suitable to their age level and thus will show an MA that is lower than their CA. To measure mental age, Binet and Simon developed varied reasoning and problem-solving questions that might predict school achievement. Lewis Terman (a professor at Stanford University) attempted to use Binet's test, but realized that items developed for Parisians did not provide a satisfactory standard for evaluating American children, so he revised and standardized the new version of the test (he establi... ... middle of paper ... ...tributes.) 
Most scores fall near the average, and fewer and fewer scores lie near the extremes. Within each age group, the Stanford-Binet tests assign any person a score according to how much that person's performance deviates above or below the average. Reliability (the extent to which a test yields consistent results, as assessed by the consistency of scores on two halves of the test, on alternate forms of the test, or on retesting): comparing test scores to those of the standardizing group still won't tell us much about the individual unless the test has reliability. Validity is the most important requirement of all: a test must actually measure what it is intended to measure. Two forms are distinguished: content validity (the extent to which a test samples the behavior that is of interest) and predictive validity (the success with which a test predicts the behavior it is designed to predict).
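The idea of scoring a person by how far their performance deviates from the age-group average can be sketched in a few lines. This is only an illustration: the function name `deviation_score`, the sample norm group, and the choice of a scale with mean 100 and standard deviation 15 are assumptions made for the example, not values taken from the text.

```python
from statistics import mean, pstdev


def deviation_score(raw, norm_group, scale_mean=100.0, scale_sd=15.0):
    """Convert a raw score to a standard score relative to an age group's norms.

    scale_mean and scale_sd are assumed values for illustration only.
    """
    group_mean = mean(norm_group)
    group_sd = pstdev(norm_group)  # spread of the norm group's raw scores
    if group_sd == 0:
        raise ValueError("norm group has no spread; deviation is undefined")
    z = (raw - group_mean) / group_sd  # deviations above or below the average
    return scale_mean + scale_sd * z


# Hypothetical raw scores for one age group (made-up data).
norm_group = [38, 42, 45, 47, 50, 50, 53, 55, 58, 62]

average_child = deviation_score(50, norm_group)  # raw score equal to the group mean
high_scorer = deviation_score(62, norm_group)    # raw score above the group mean
```

A raw score equal to the group mean maps to the centre of the scale, and scores above or below the mean move away from it in proportion to the group's spread.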
Webster's Collegiate Dictionary defines intelligence as the capacity to apprehend facts and propositions, to reason about them, and to understand them and their relations to each other. A. M. Turing had this definition in mind when he made his predictions and designed his test, commonly known as the Turing test. His test is, in principle, simple. A group of judges converse with different entities, some computers and some humans, without knowledge of which is which. The job of the judges is to discern which entity is a computer. Judges may ask them any question they like, "Are you a computer?" excepted, and the participants may answer with anything they like and, in turn, ask questions of the judges. The concept of the test is not difficult, but creating an entity capable of passing the test with current technology is virtually impossible.
Validity refers to the ability of an instrument to yield scores that can be interpreted appropriately, meaningfully, and usefully (Polit & Beck, 2010). An instrument is developed to serve three major functions: (1) to represent a specific universe of content, (2) to measure specific psychological attributes, and (3) to establish a relationship with a particular criterion. There are three types of validity; each type corresponds to one of these three functions.
Probably the most widely known of these measures has been the Scholastic Aptitude Test (now the SAT Reasoning Test, or SAT). Administered by the Educational Testing Service after World War II, the test was in many ways the big idea of James Bryant Conant. Committed to the ideal of a democratic, classless society, Conant thought that such tests could identify the ability of individuals and ultimately help to equalize educational opportunities (Frontline, 1999). Unfortunately, many have argued that instead of fostering equality, the SATs have become an instrument to separate the social classes, and many in the testing movement were not as magnanimous as James Bryant Conant.
One of the most prominent limitations of age-equivalent scores is the way they are calculated. Age-equivalent scores are essentially estimated, or predicted, from a group of children in the normative sample who fall within a certain age range. Age-equivalent scores are then paired with mean raw scores from a specific age group of children. As a result, the age-equivalent scores being used represent a group of individuals who were not tested.
The Binet-Simon intelligence scale, which was created in 1905, contained problems in order of increasing difficulty. These items covered vocabulary, memory, common knowledge, and other cognitive abilities. Binet's tests were accepted widely around the world with the exception of France, which basically rejected them. In 1908 Binet and Simon revised the test, and for each test item Binet decided whether an average child of a given age would be able to get the question right. Thus he was able to differentiate between the chronological age and the mental age of a child. A child's mental age was determined by comparing the child's performance with the scores of average children of the same age.
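The comparison of mental age with chronological age was later condensed into a single number, the ratio IQ: mental age divided by chronological age, multiplied by 100. The excerpt above does not state this formula, so the sketch below is an illustration under that assumption, and the function name `ratio_iq` is hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100.

    Both ages must be in the same unit (for example, years).
    """
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age / chronological_age


# An average 10-year-old (mental age equal to chronological age).
average_child = ratio_iq(mental_age=10, chronological_age=10)

# A 10-year-old who passes only the items typical of 8-year-olds.
below_average_child = ratio_iq(mental_age=8, chronological_age=10)
```

When mental age and chronological age match, the quotient sits at 100; a mental age below the chronological age gives a value under 100.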
Similar to Sternberg, Binet came to the conclusion that intelligence is the sum of mental processes (Flanagan & Harrison, 2005). He developed the first intelligence test in order to assess how much children benefitted from school education. Because Binet believed that intelligence consists of different components, the Binet-Simon scale included language, auditory processing, learning and memory, as well as judgement and problem solving (Kamin, 1974). The results were supposed to identify the student's mental age. Lewis Terman introduced the Binet-Simon test to America and adapted it to sort army recruits in World War I (Comer et al., 2013). The Stanford-Binet test, developed by Terman in 1916, aimed to be an improved version that was able to measure mental age more appropriately (Kamin, 1974). He was convinced that intelligence is the ability to form concepts and to think abstractly (Comer et al., 2013). The Stanford-Binet test has been described by Maud Minton as superior to other intelligence tests of that time because it was very precise, had detailed guidelines, and measured IQ, which became the standard scoring system (Flanagan & Harrison,
How many of us really believe that a child's intelligence, achievement, and confidence can be represented adequately by standardized tests? How can any distribution curve classify all children? What about all we have learned about children's growth and their response to education? Few teachers and parents would accept that a single test score could define any child (Russel, 2002). We must ask whether these tests address the educational concerns of teachers and parents and whether they provide useful information about individual children or the class. Almost all teachers feel pressure to teach to the tests and feel that tests clearly limit educational possibilities for students (Russel, 2002). We feel it is detrimental to a child's education to enjoy reading. An article reported by the BBC news (2003) entitle...
From a similar perspective, Hughes (as cited in Brown, 2004) states that validity means discovering whether a test ‘measures accurately what it is intended to measure’. There are five types of evidence of validity (Brown, 2004, p. 22): 1. Content validity: any attempt to show that the content of the test is a representative sample from the domain that is to be tested (to achieve content validity in classroom assessment, test performance directly). 2. Criterion-related evidence: the extent to which the ‘criterion’ of the test has actually been reached.
Cozby & Bates (2015) define reliability as the “stability or consistency of a measure of behavior” (p. 101). In the measurement of behavior, there are three types of reliability: test-retest, internal consistency, and interrater. Test-retest reliability is assessed by measuring the same individuals at two separate points in time. Internal consistency reliability is assessed at only one point in time, by checking how well the items of the measure agree with one another. Interrater reliability is assessed by having two or more raters measure the same individuals and checking how closely their ratings agree.
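The three kinds of reliability can be illustrated with simple coefficients. The sketch below is a minimal example under stated assumptions: it uses the Pearson correlation for test-retest and interrater agreement and Cronbach's alpha for internal consistency, which are common choices but are not named in the passage. The function names and the small data sets are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per person."""
    k = len(rows[0])  # number of items on the measure
    if k < 2:
        raise ValueError("need at least two items")
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up data for five people.
time_1 = [12, 15, 9, 20, 17]   # first administration
time_2 = [13, 14, 10, 19, 18]  # same people, second administration
rater_a = [3, 4, 2, 5, 4]      # one observer's ratings
rater_b = [3, 5, 2, 5, 3]      # a second observer rating the same people
items = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 5, 4], [3, 4, 4]]

test_retest = pearson_r(time_1, time_2)
interrater = pearson_r(rater_a, rater_b)
internal_consistency = cronbach_alpha(items)
```

Values close to 1 indicate consistent measurement. For categorical ratings, an agreement statistic such as Cohen's kappa would be a more usual choice than a correlation.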
In this world, there are many individuals who differ not only in demographics but also neurologically. Because there are so many people, it is important to understand each individual in order to help them in areas such as education and the work force. For this reason psychologists have aimed to further understand individuals through the use of psychological assessments. This paper aims to examine a particular assessment tool, the Stanford-Binet Intelligence Scales (Fifth Edition), which measures both intelligence and cognitive abilities (Roid, 2003). This assessment is usually administered by psychologists, and the scores are most often used to determine academic placement and the services allotted to children and adolescents (although the scales are also suitable for adults) (Wilson & Gilmore, 2012). Before the investigation dives into the particulars of the test, such as its strengths and weaknesses, it is best to first learn more about the intelligence scales' general characteristics.
Reliability is defined as the consistency of the measurement results, or the ability of a test to produce comparable results across repeated measurements within the same parameters or conditions (Kaplan & Saccuzzo, 2013; Bordens & Abbott, 2014). In terms of verifying validity, however, there are basically three different types of evidence that are used to confirm the validity of a test: construct-related evidence, content-related evidence, and criterion-related evidence (Kaplan & Saccuzzo, 2013). Content-related evidence of validity, for instance, is the type of evidence that identifies the association between the questions or items of a test and the content matter that is being evaluated.
Reliability is defined as dependability. Validity is defined as being truthful, fair, or reasonable. Standard one of reliability asks if the material being tested is familiar to the student being tested. It also asks if the student is able to perform the same action, or come up with the same result, using the same material, multiple times. Standard two of reliability asks if there is enough proof that the student can in fact perform the skill being tested. Homework, classwork, and scores on previous quizzes could help provide proof that the student knows the material or can perform the skill. Standard three of reliability asks if the directions and expectations are clear to the students being tested. The students
Intelligence by definition is “the ability to acquire and apply knowledge and skills” (Oxford Dictionary, 2014). However, many psychologists argue that there is no standard definition of ‘intelligence’, and there have been many different theories over time as psychologists try to find better ways to define this concept (Boundless, 2013). While some believe in a single, general intelligence, others believe that intelligence involves multiple abilities and skills. Another widely debated question is whether intelligence is genetically determined and fixed, or whether it is open to change through learning and environmental influence. This is commonly known as the nature vs. nurture debate.