Statistical Investigation

Introduction

I have chosen to do this statistical coursework that uses data from
'Mayfield High School.' Although this is a fictitious school, the data
is based on a real school. As the data has been collected for me, it
is called secondary data.

I believe that this coursework will allow me to illustrate my ability
to handle data, use specific techniques and apply higher level
statistical maths by being able to use a variety of methods in order
to analyse and compare sets of data. During this project I will be
examining the relationships between the attributes of the pupils of
Mayfield High School. My aim is to produce a line of enquiry which
has two or more statistics regarding the pupils which are related to
each other.

This table shows how many boys and girls there are in each year group
at Mayfield High.

Year Group    Number of Boys    Number of Girls    Total
7             150               150                300
8             145               125                270
9             120               140                260
10            100               100                200
11            84                86                 170

The total number of students at the school is 1200.

Data is provided for each pupil in the following categories:

* Name

* Age

* Year Group

* IQ

* Weight

* Height

* Hair colour

* Eye colour

* Shoe size

* Distance from home to school

* Usual method of travel to school

* Number of brothers or sisters

* Key stage 2 & 3 results in English, Mathematics and Science

From the categories above, I need to pick several types of data to base
my investigation on. However, I have decided to pick only two (at most
three) pieces of data, as time is a limiting factor in this
coursework. When deciding my data categories, there are a few things
that I need to bear in mind. I need to use quantitative data, so I am
able to apply all higher level statistical maths to my results. I also
need to make sure that the data I choose are closely related, so I can
analyse my results thoroughly.

There are several lines of enquiry at this point that I may wish to
follow up. These are:

* The relationship between IQ and Key stage 3 English results

* The relationship between height and weight

* The relationship between shoe size and height

Through basic observations of the people in my surroundings, I believe
that there may be a strong relationship between a person's height and
weight, not only among people in general but also within each gender.
However, I also feel that age is a contributing factor, and I intend
to look into that later on in the coursework. I have made this
decision based on the fact that these pieces of data are interrelated
and they are continuous (quantitative).

As previously stated, my line of enquiry will be the relationship
between height and weight (with the introduction of age). I have
several hypotheses related to this investigation:

* Boys will be taller than girls

* As height increases, so does weight

* Girls are heavier than boys

However, I must also take into consideration that the relationships
may be different when the genders are treated separately.

Using the data for every pupil in the whole school would take too much
time and effort, so I have decided to take a sample rather than use
the whole population. Because time is a limiting factor, sampling will
help me very much. It is important to choose the sample without bias
so that the results will represent the whole population. There are
many types of sampling, and I now need to find out which type suits my
investigation best.

Random Sampling

In a random sample, every member of the population has an equal chance
of being selected.

* Advantages: Every member of the population has an equal chance of
being selected, so the selection itself is not biased.

* Disadvantages: Due to its unpredictability, anomalous results can
sometimes be obtained that are not representative of the
population. In addition, these irregular results may be difficult
to spot. For our purposes, it does not guarantee proportionate
numbers from each year or equal numbers of both genders.

Systematic Sampling

In a systematic sample, the members of the sample are chosen at
regular intervals from a list of the population.

* Advantages: Can eliminate some sources of bias

* Disadvantages: Can introduce bias where the pattern used for the
samples coincides with a pattern in the population. For our
purposes, it guarantees a representative sample of year groups but
not of gender

Stratified Sampling

A population may contain separate groups or strata. Each group needs
to be fairly represented in the sample. The number from each group is
proportional to the group size. The selection is then made at random
from each group.

* This form of sampling will work well for our purposes

Quota Sampling

As with stratified samples, the population is broken down into
different categories. However, the size of the sample of each category
does not reflect the population as a whole. This can be used where an
unrepresentative sample is desirable (e.g. you might want to interview
more children than adults for a survey on computer games), or where it
would be too difficult to undertake a stratified sample.

* Advantages: Simpler to undertake than a stratified sample.
Sometimes a deliberately biased sample is desirable

* Disadvantages: Not a genuine random sample, and is likely to yield
a biased result. For our purposes it is not very reliable because
it depends on the interviewer to choose the sample

Cluster Sampling

Used when populations can be broken down into many different
categories, or clusters (e.g. church parishes). Rather than taking a
sample from each cluster, a random selection of clusters is chosen to
represent the whole. Within each cluster, a random sample is taken.

* Advantages: Less expensive and time consuming than a fully random
sample. Can show "regional" variations.

* Disadvantages: Not a genuine random sample. Likely to yield a
biased result (especially if only a few clusters are sampled).

After looking at the advantages and disadvantages of each type of
sampling, I have chosen to use stratified sampling, as this form of
sampling will work well for our purposes: each year group and gender
is represented in proportion to its size.

Having decided on my line of enquiry and type of sampling, I now need
to decide how big my sample will be. As the size of the sample will
affect the reliability of my results and conclusions, it is imperative
that I make the correct choice.

The bigger a sample, the more useful the data will be. If you select a
lot of people, your results will be closer to the actual results for
the whole school. However, if you choose too many people the data
becomes too difficult to analyse and takes too long to collate and
sort. 5 - 10% is usually a fair representation of a population, so I
have decided to use a 9% sample, which is 54 people. I think this will
be a good representation of the population and is also a reasonable
figure to manage.

Outliers

When collecting my data, I need to check for outliers and anomalies:
untypical values in my sample which appear to lie outside the general
range (e.g. a weight of 1kg or 600kg, or a height of 0.01m or 10m).
Once I present my results in a graph, it will be easy to see where any
outlier lies.

If these outliers were included in my calculations or graphs they
would distort the data, disrupt the correlation of graphs, and
therefore affect my conclusion and whether or not my hypotheses appear
correct. This is why it is crucial that I disregard any information
that is blatantly incorrect.
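
One common way to screen for outliers numerically, rather than by eye, is the 1.5 x IQR rule. This is not necessarily the check used in this coursework; the sketch below is only an illustration, and the function name `find_outliers` and the weights in the example are made up.

```python
import statistics

def find_outliers(values):
    """Return the values lying more than 1.5 x IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # lower quartile, median, upper quartile
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical weights in kg; 600 is a deliberately implausible entry.
weights = [40, 43, 45, 45, 48, 50, 52, 600]
print(find_outliers(weights))
```

Any value flagged this way would still need a judgement on whether it is a recording error or just an unusual pupil.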

Sampling Method (In Detail)

In order to produce my results, I need to know how my sampling method
works.

1. Count boys and girls per year group

2. Work out sample size

3. Find the fraction of pupils in each year

4. Find how many people there are in each year out of 54 (9% sample)

5. Use the same method to calculate the number of girls and boys in
each year for the sample

6. Use random sampling to choose the correct number of boys and girls
per year group and enter the results in tables

7. Identify any anomalous data/outliers and reselect those data items

Mathematical Techniques

In order to thoroughly analyze and evaluate my data, there are many
mathematical techniques, diagrams and graphs I will need to use. Here
is a list of them:

Diagrams:

1. Histograms - A histogram is constructed from a frequency table.
The intervals are shown on the X-axis and the number of scores in
each interval is represented by the height of a rectangle located
above the interval.

2. Box Plots - A box plot provides an excellent visual summary of
many important aspects of a distribution. The box stretches from
the lower quartile to the upper quartile and therefore contains
the middle half of the scores in the distribution. The median is
shown as a line across the box. Therefore 1/4 of the distribution
is between this line and the top of the box and 1/4 of the
distribution is between this line and the bottom of the box.

3. Scatter Diagram - A type of diagram used to show the relationship
between data items that have two numeric properties. One property
is represented along the x-axis and the other along the y-axis.
Each item is then represented by a single point.

4. Cumulative Frequency Graphs - A cumulative frequency graph can be
used to estimate useful statistical measures such as the median,
the quartiles and the interquartile range.

5. Line Of Best Fit - Single line drawn through a series of data
points as a best representation of the underlying trend. Can be a
straight line or a curve.
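
A straight line of best fit can also be calculated rather than drawn by eye. The sketch below uses the least-squares method, which is an assumption on my part (the coursework may simply draw the line through the mean point); the function name and the example points are made up for illustration.

```python
def line_of_best_fit(xs, ys):
    """Least-squares straight line: returns (gradient, intercept) of y = mx + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x  # the line passes through the mean point
    return gradient, intercept

# Made-up points lying exactly on y = 2x + 1.
gradient, intercept = line_of_best_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(f"y = {gradient:.2f}x + {intercept:.2f}")
```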

Calculations:

1. Mean

2. Mode

3. Median

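4. Mean & Modal Class for Grouped Continuous Data - The mean is
estimated by multiplying the midpoint of each class by its
frequency, adding these up and dividing by the total frequency.
The modal class is the class with the highest frequency.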

5. Interquartile Range - The distance between the upper and lower
quartiles. As a measure of variability, it is less sensitive than
the standard deviation or range to the possible presence of
outliers. It is also used to define the box in a box-and-whisker
plot.

6. Standard Deviation - It is the most commonly used measure of
spread.

7. Normal distribution - Normal distributions are a family of
distributions that have the same general shape. They are symmetric
with scores more concentrated in the middle than in the tails.
Normal distributions are sometimes described as bell shaped.

8. Spearman's Rank Correlation Coefficient - The Spearman's Rank
Correlation Coefficient is used to discover the strength of a link
between two sets of data.

9. Equation of Line of Best Fit - Equation of the line that shows the
underlying trend.
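
As an illustration of one of the calculations listed, Spearman's Rank Correlation Coefficient can be worked out from the formula 1 - 6 x (sum of d squared) / (n(n squared - 1)), where d is the difference between the two ranks of each item. The sketch below is a minimal version with made-up function names; tied values are given their average rank, in which case the formula is only an approximation.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rank correlation using 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Made-up data: perfectly increasing, so the coefficient is 1.
print(spearman([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))
```

A value near 1 suggests strong positive correlation, near -1 strong negative correlation, and near 0 little or no correlation.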

Collecting the Data

In order to find my results, I will need to sort the data and put it
into tables. As I am using stratified sampling, I have had to count up
the number of boys and girls in each year and work out my sample size.
Once I have done this, I will record my results in two separate tables
(one for males, one for females), in year order. From there, I will
create separate tables for each year and then one large mixed table.
After I have finished sorting out the tables, I will draw various
scatter diagrams: one for males, one for females, one mixed, and then
one for each year (for both mixed and separate genders).

Finding the Results

As I have previously stated, I have decided to use a sample size of
9%, which in total is 54 people. I now need to apply that information
to the investigation and work out my sample for each year, gender etc.

Data:

Year     Boys    Girls    Total
7        75      75       150
8        65      70       135
9        62      68       130
10       51      49       100
11       41      44       85
Total                     600

Sample size: 9% of 600 = 54

Now, I have to calculate how many pupils to examine within each year,
because each year group has a different number of students. I will
calculate the proportion of pupils from each of the year groups.

Stratified Sample:

Year   Fraction of population   Number out of 54   No. of Girls in Sample      No. of Boys in Sample
7      150/600 = 0.25           13.5               75/150 x 13.5 = 6.75 (7)    75/150 x 13.5 = 6.75 (7)
8      135/600 = 0.225          12.2               70/135 x 12.2 = 6.33 (6)    65/135 x 12.2 = 5.87 (6)
9      130/600 = 0.2167         11.7               68/130 x 11.7 = 6.12 (6)    62/130 x 11.7 = 5.58 (6)
10     100/600 = 0.1667         9                  49/100 x 9 = 4.41 (4)       51/100 x 9 = 4.59 (5)
11     85/600 = 0.1417          7.6                44/85 x 7.6 = 3.93 (4)      41/85 x 7.6 = 3.67 (4)

Due to rounding, my sample size has been adjusted from 54 to 55. Given
as a percentage, this would be:

55/600 x 100 = 9.166666667

= 9.2%
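
The allocation above can be checked with a short script. This is only a sketch: it takes each group's share of 54 directly from the pupil counts in the data table and rounds to the nearest whole pupil, which skips the intermediate rounding of the per-year figure but gives the same answers here. The names `counts`, `TARGET`, `round_half_up` and `allocate` are my own.

```python
import math

# Pupil counts per year group, copied from the data table.
counts = {
    7: {"boys": 75, "girls": 75},
    8: {"boys": 65, "girls": 70},
    9: {"boys": 62, "girls": 68},
    10: {"boys": 51, "girls": 49},
    11: {"boys": 41, "girls": 44},
}
TARGET = 54  # 9% of 600

def round_half_up(x):
    """Round to the nearest whole number, with halves rounding up."""
    return math.floor(x + 0.5)

def allocate(counts, target):
    """Share the target sample between strata in proportion to their size."""
    population = sum(sum(group.values()) for group in counts.values())
    return {
        year: {sex: round_half_up(n * target / population) for sex, n in group.items()}
        for year, group in counts.items()
    }

allocation = allocate(counts, TARGET)
for year, group in allocation.items():
    print(year, group)
print("Total:", sum(sum(group.values()) for group in allocation.values()))
```

Because each stratum is rounded separately, the total does not have to equal the target exactly, which is why the sample ends up at 55 rather than 54.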

I now need to randomly select, within each year and gender, the
required number of pupils for each category. I will do this by using
the random function on my calculator. I need to make sure the
selection is random, so that it will not be biased. Once I have done
this, I need to check for any anomalies in my selected pupils'
weight/height.
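
The same selection could be made on a computer instead of a calculator. The sketch below assumes the pupils in each year and gender group are numbered from 1 upwards (the numbering, the function name `pick_pupils` and the seed are all my own assumptions) and picks the required number without repeats.

```python
import random

def pick_pupils(stratum_size, sample_size, seed=None):
    """Pick sample_size distinct pupil numbers from 1..stratum_size at random."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return sorted(rng.sample(range(1, stratum_size + 1), sample_size))

# Year 7 boys: 75 pupils in the group, 7 needed for the sample.
chosen = pick_pupils(75, 7, seed=1)
print(chosen)
```

If a chosen pupil turns out to have anomalous data, another number can be drawn in the same way to replace them.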


Results

Boys

No.   Year   Height (m)   Weight (kg)
1     7      1.48         44
2     7      1.59         52
3     7      1.49         43
4     7      1.52         45
5     7      1.54         43
6     7      1.55         40
7     7      1.59         45
8     8      1.57         48
9     8      1.67         51
10    8      1.71         46
11    8      1.66         43
12    8      1.59         47
13    8      1.42         40
14    9      1.67         54
15    9      1.80         48
16    9      1.75         63
17    9      1.46         45
18    9      1.50         70
19    9      1.82         66
20    10     1.80         49
21    10     1.60         50
22    10     1.62         52
23    10     1.65         50
24    10     1.77         59
25    11     1.91         82
26    11     1.62         56
27    11     1.74         50
28    11     2.00         86

Girls

No.   Year   Height (m)   Weight (kg)
1     7      1.61         45
2     7      1.61         47
3     7      1.56         43
4     7      1.48         42
5     7      1.50         40
6     7      1.56         53
7     7      1.58         48
8     8      1.72         43
9     8      1.62         53
10    8      1.62         54
11    8      1.60         46
12    8      1.75         45
13    8      1.48         46
14    9      1.57         38
15    9      1.62         54
16    9      1.64         40
17    9      1.60         46
18    9      1.80         60
19    9      1.60         51
20    10     1.52         45
21    10     1.72         56
22    10     1.66         45
23    10     1.73         42
24    11     1.70         50
25    11     1.68         48
26    11     1.52         38
27    11     1.62         48
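
As a first look at the hypothesis that boys will be taller than girls, the averages of the sampled heights can be compared. The sketch below copies the heights from the two results tables, taking the values to be in metres, and uses Python's `statistics` module; the variable names are my own and this is only a quick check, not the full analysis.

```python
import statistics

# Sampled heights in metres, copied from the results tables in year order.
boys_heights = [
    1.48, 1.59, 1.49, 1.52, 1.54, 1.55, 1.59,   # Year 7
    1.57, 1.67, 1.71, 1.66, 1.59, 1.42,         # Year 8
    1.67, 1.80, 1.75, 1.46, 1.50, 1.82,         # Year 9
    1.80, 1.60, 1.62, 1.65, 1.77,               # Year 10
    1.91, 1.62, 1.74, 2.00,                     # Year 11
]
girls_heights = [
    1.61, 1.61, 1.56, 1.48, 1.50, 1.56, 1.58,   # Year 7
    1.72, 1.62, 1.62, 1.60, 1.75, 1.48,         # Year 8
    1.57, 1.62, 1.64, 1.60, 1.80, 1.60,         # Year 9
    1.52, 1.72, 1.66, 1.73,                     # Year 10
    1.70, 1.68, 1.52, 1.62,                     # Year 11
]

for label, heights in (("Boys", boys_heights), ("Girls", girls_heights)):
    print(
        f"{label}: n = {len(heights)}, "
        f"mean = {statistics.mean(heights):.3f} m, "
        f"median = {statistics.median(heights):.3f} m, "
        f"standard deviation = {statistics.pstdev(heights):.3f} m"
    )
```

Here `pstdev` treats the listed pupils as the whole data set; `statistics.stdev` could be used instead if the sample version of the standard deviation is wanted.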

Organising My Results

Although I have already presented my results in two separate tables,
one for each gender, the results are not concise enough. In order to
fully analyse my results, I will need to put them into scatter
diagrams, histograms etc. Therefore, my results need to be grouped
into around 5-8 groups, which are the same for both genders. This is
because when I put my results into the diagrams, I will need to
compare both genders, which requires me to use the same groups for
both sexes. Once I have chosen my groups, I will enter the information
into frequency tables and use those for my histograms and scatter
diagrams.
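
Grouping the results into a frequency table can be sketched as follows. The class width of 10cm, the starting point of 140cm and the function name are my own choices for illustration; heights are converted to whole centimetres so that the class boundaries are exact, and the starting point is assumed to be no greater than the smallest height.

```python
from collections import Counter

def frequency_table(values_cm, start, width):
    """Count values in equal classes: start <= h < start + width, and so on."""
    counts = Counter((v - start) // width for v in values_cm)
    top = max(counts)  # index of the highest class that contains a value
    return [
        (start + i * width, start + (i + 1) * width, counts.get(i, 0))
        for i in range(top + 1)
    ]

# Year 7 boys' heights from the results table, in centimetres.
heights_cm = [148, 159, 149, 152, 154, 155, 159]
for low, high, freq in frequency_table(heights_cm, start=140, width=10):
    print(f"{low} <= h < {high}: {freq}")
```

Using the same `start` and `width` for boys and girls keeps the groups identical, so the two frequency tables can be compared directly.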

How to Cite this Page

MLA Citation:
"Statistical Investigation." 123HelpMe.com. 23 Jul 2014
    <http://www.123HelpMe.com/view.asp?id=122728>.



