Regression of Baseball Player Salaries


Introduction

Major League Baseball (MLB) is the organization of professional baseball teams that play at the major league level. The MLB data set provides the 2005 team salaries as well as the individual salaries of players on the 30 teams (Lind, Marchal & Wathen, 2008). It also gives information such as batting averages, wins, home runs, and errors (Lind, Marchal & Wathen, 2008). Two teams stand out when looking at these statistics: St. Louis and Kansas City. The two are drastically different; one has the most wins in the MLB data set, and the other has the fewest. Because both teams play in the major leagues, both can be considered good teams, which raises the question of whether salary plays a part in one team performing better than another. We will look at team statistics as well as individual statistics within the two teams to investigate whether salaries affect the quality of performance. In this paper we use regression analysis to test whether salaries affect the performance of St. Louis and Kansas City.

Hypothesis Statement

The two samples from the data set differ in many ways, beginning with their leagues: St. Louis plays in the National League and Kansas City in the American League. Our hypothesis is that salary affects performance as measured by wins and losses. How does salary affect a team's batting average? How does salary affect a team's earned run average (ERA)? Kansas City has a salary of $36.9 million, a batting average of 0.263, and an ERA of 5.49. St. Louis has a salary of $92.1 million, a batting average of 0.270, and an ERA of 3.49. Are batting average and ERA correlated with each team's salary? In the data set th...

... middle of paper ...
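The questions in the hypothesis statement amount to asking whether the slope of a regression line relating salary to a performance measure differs from zero. A minimal Python sketch of that test is given here, assuming a simple one-predictor least-squares fit. The function name slope_f_test and the toy numbers are hypothetical placeholders; they are not values from the MLB data set and not necessarily the procedure used in the paper.

```python
from statistics import mean


def slope_f_test(x, y):
    """Fit y = a + b*x by least squares and return the ANOVA F statistic
    for the null hypothesis that the slope b equals zero."""
    n = len(x)
    if n != len(y) or n < 3:
        # With two points the line fits exactly and no error df remain.
        raise ValueError("need at least three paired observations")

    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Partition total variation into explained and unexplained parts.
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_error = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_regression = ss_total - ss_error

    df_regression, df_error = 1, n - 2
    if ss_error == 0:
        f_stat = float("inf")  # perfect fit
    else:
        f_stat = (ss_regression / df_regression) / (ss_error / df_error)

    return {
        "slope": slope,
        "intercept": intercept,
        "f_stat": f_stat,
        "df_regression": df_regression,
        "df_error": df_error,
    }


# Toy placeholder numbers (NOT MLB data), chosen only to exercise the function.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
result = slope_f_test(toy_x, toy_y)
print(round(result["slope"], 3), round(result["f_stat"], 3))
```

For these toy numbers the slope is 0.6 and the F statistic is about 4.5, which falls below the 5% critical value of 10.13 for (1, 3) degrees of freedom, so the slope would not be judged different from zero. The test needs more than two observations: a line through two teams alone fits perfectly and leaves no degrees of freedom for error, so applying it to salary and performance would require more teams from the data set.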
Analysis of variance (ANOVA) is a procedure in which the total variability of a random variable is subdivided into components so that it can be better understood or attributed to each of the sources that cause it to vary. Applied to regression, ANOVA techniques are used to determine the usefulness of a regression model and the degree to which changes in an independent variable X explain changes in a dependent variable Y. For example, we can conduct a hypothesis test to determine whether the slope coefficient is equal to zero (the variables are unrelated) or whether the relationship is statistically significant (the slope b is different from zero). An F-test can be used for this purpose.

Conclusion

References

Lind, Marchal, & Wathen. (2008). Statistical Techniques in Business & Economics, 13th Edition. New York, NY: McGraw-Hill.
