# Statistical Investigation

Introduction

I have chosen to do this statistical coursework that uses data from
'Mayfield High School.' Although this is a fictitious school, the data
is based on a real school. As the data has been collected for me, it
is called secondary data.

I believe that this coursework will allow me to illustrate my ability
to handle data, use specific techniques and apply higher level
statistical maths by being able to use a variety of methods in order
to analyse and compare sets of data. During this project I will be
examining the relationships between the attributes of the pupils of
Mayfield High School. My aim is to produce a line of enquiry which
has two or more statistics regarding the pupils which are related to
each other.

This table shows how many boys and girls there are in each year group
at Mayfield High.

| Year Group | Number of Boys | Number of Girls | Total |
|---|---|---|---|
| 7 | 150 | 150 | 300 |
| 8 | 145 | 125 | 270 |
| 9 | 120 | 140 | 260 |
| 10 | 100 | 100 | 200 |
| 11 | 84 | 86 | 170 |

The total number of students at the school is 1200.

Data is provided for each pupil in the following categories:

* Name

* Age

* Year Group

* IQ

* Weight

* Height

* Hair colour

* Eye colour

* Shoe size

* Distance from home to school

* Usual method of travel to school

* Number of brothers or sisters

* Key stage 2 & 3 results in English, Mathematics and Science

From the abovementioned, I need to pick several types of data to base
my investigation on. However, I have decided to pick only two (at
most three) pieces of data, as time is a limiting factor in this
coursework. When deciding my data categories, there are a few things
that I need to bear in mind. I need to use quantitative data, so I am
able to apply all higher level statistical maths to my results. I also
need to make sure that the data I choose are closely related, so I can
analyse my results thoroughly.

There are several lines of enquiry at this point that I may wish to
pursue:

· The relationship between IQ and Key stage 3 English results

· The relationship between height and weight

· The relationship between shoe size and height

Through basic observations of the people in my surroundings, I believe
that there may be a strong relationship between a person's height and
weight, not only among pupils in general but also within each gender.
However, I also feel that age is an affecting factor, and intend to
look into that later on in the coursework. I have made this decision
because these pieces of data are interrelated and continuous
(quantitative).

As previously stated, my line of enquiry will be the relationship
between height and weight (with the introduction of age). I have
several hypotheses related to this investigation:

* Boys will be taller than girls

* As height increases, so does weight

* Girls are heavier than boys

However, you must also take into consideration that relationships will
be different when genders are treated separately.

Collecting data on every person in the whole school would take too
much time and energy, and time is a limiting factor in this
coursework. I have therefore decided to take a sample rather than use
the whole of the population. It is important to choose the sample
without bias so that the results will represent the whole population.
There are many types of sampling, and I now need to find out which
type suits my investigation best.

Random Sampling

In a random sample, every member of the population has an equal chance
of being selected.

* Advantages: Every member of the population has an equal chance of
being selected, so the method itself introduces no bias.

* Disadvantages: Due to its unpredictability, anomalous results can
sometimes be obtained that are not representative of the
population. In addition, these irregular results may be difficult
to spot. For our purposes, it does not guarantee proportionate
numbers from each year group or from each gender.
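As a rough illustration, a simple random sample could be drawn in Python as sketched below. The pupil IDs and the seed are made up for the example; the coursework itself relies on a calculator's random function rather than this code.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Return k distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(population, k)

# Hypothetical pupil IDs 1..600 standing in for the school list.
pupil_ids = list(range(1, 601))
sample = simple_random_sample(pupil_ids, 54, seed=1)
print(len(sample), len(set(sample)))  # prints "54 54": no pupil is picked twice
```

Nothing in this draw forces the year groups or genders to appear in proportion, which is the disadvantage noted above.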

Systematic Sampling

In a systematic sample, every member of the sample is chosen at
regular intervals from the list.

* Advantages: Can eliminate some sources of bias

* Disadvantages: Can introduce bias where the pattern used for the
samples coincides with a pattern in the population. For our
purposes, it guarantees a representative sample of year
groups but not of gender.
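A minimal sketch of systematic sampling is given below, assuming the population is held as an ordered list. The function name, the pupil IDs and the choice of interval (list length divided by sample size) are illustrative assumptions, not part of the coursework data.

```python
import random

def systematic_sample(population, k, start=None):
    """Take k members at a fixed interval, beginning at a start offset."""
    if not 0 < k <= len(population):
        raise ValueError("k must be between 1 and the population size")
    step = len(population) // k            # sampling interval
    if start is None:
        start = random.randrange(step)     # random start inside the first interval
    return population[start::step][:k]

pupil_ids = list(range(1, 601))            # hypothetical school list
print(systematic_sample(pupil_ids, 54, start=0)[:5])  # [1, 12, 23, 34, 45]
```

If the list happened to alternate boy, girl, boy, girl and the interval were even, only one gender would ever be picked, which is the kind of pattern bias described above.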

Stratified Sampling

A population may contain separate groups or strata. Each group needs
to be fairly represented in the sample. The number from each group is
proportional to the group size. The selection is then made at random
from each group.

* This form of sampling will work well for our purposes

Quota Sampling

As with stratified samples, the population is broken down into
different categories. However, the size of the sample of each category
does not reflect the population as a whole. This can be used where an
unrepresentative sample is desirable (e.g. you might want to interview
more children than adults for a survey on computer games), or where it
would be too difficult to undertake a stratified sample.

* Advantages: Simpler to undertake than a stratified sample.
Sometimes a deliberately biased sample is desirable

* Disadvantages: Not a genuine random sample, and is likely to yield
a biased result. For our purposes it is not very reliable because
it depends on the interviewer to choose the sample

Cluster Sampling

Used when populations can be broken down into many different
categories, or clusters (e.g. church parishes). Rather than taking a
sample from each cluster, a random selection of clusters is chosen to
represent the whole. Within each cluster, a random sample is taken.

* Advantages: Less expensive and time consuming than a fully random
sample. Can show "regional" variations.

* Disadvantages: Not a genuine random sample. Likely to yield a
biased result (especially if only a few clusters are sampled).

After looking at all of the advantages and disadvantages of each type
of sampling, I have chosen to use stratified sampling, as this form of
sampling will work well for our purposes. The reasons are stated
above.

As I have now decided on my line of enquiry and type of sampling, I
now need to decide how big my sample size will be. As different sizes
of sample will affect the reliability of my results and conclusions,
it is imperative that I make the correct choice when deciding the size
of my sample.

The bigger a sample, the more useful the data will be. If you select a
lot of people, your results will be closer to the actual results for
the whole school. However, if you choose too many people, the data
becomes too difficult to analyse and takes too long to collate and
sort. 5-10% is usually a fair representation of a population, so I
have decided to use a 9% sample, which is 54 people. I think this will
be a good representation of the population and is also a reasonable
figure to manage.

Outliers

When collecting my data, I need to check for outliers and
anomalies. I will need to check my sampled data for untypical values
which appear to lie outside the general range (e.g. a weight of 1 kg
or 600 kg, or a height of 0.01 m or 10 m). Once I present my results
in a graph, it will be easy to see where any outlier lies.

If these outliers were included in my calculations or graphs they
would distort the data, disrupt the correlation of graphs, and
therefore affect my conclusion, and whether or not my hypothesis is
correct. This is why it is crucial that I disregard any information
that is blatantly incorrect.
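One simple way to screen for blatantly incorrect values is to test each record against a plausible range. The sketch below does this; the range limits and the sample rows are assumptions chosen for illustration and are not taken from the Mayfield data.

```python
# Plausible ranges are assumed for illustration only.
HEIGHT_RANGE_M = (1.0, 2.2)
WEIGHT_RANGE_KG = (20, 120)

def is_plausible(height_m, weight_kg):
    """True if both measurements fall inside the assumed ranges."""
    return (HEIGHT_RANGE_M[0] <= height_m <= HEIGHT_RANGE_M[1]
            and WEIGHT_RANGE_KG[0] <= weight_kg <= WEIGHT_RANGE_KG[1])

# Made-up rows of (height in m, weight in kg), two of them clearly wrong.
records = [(1.48, 44), (0.01, 45), (1.59, 600), (1.52, 45)]
kept = [r for r in records if is_plausible(*r)]
rejected = [r for r in records if not is_plausible(*r)]
print(kept)      # [(1.48, 44), (1.52, 45)]
print(rejected)  # [(0.01, 45), (1.59, 600)]
```

A rejected pupil would then be replaced by selecting another at random from the same year and gender.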

Sampling Method (In Detail)

In order to produce my results, I need to know how my sampling method
works.

1. Count boys and girls per year group

2. Work out sample size

3. Find the fraction of pupils in each year

4. Find how many people there are in each year out of 54 (9% sample)

5. Use same method to calculate number of girls and boys in each
year for sample

6. Use random sampling to choose correct number of boys and girls
per year group and enter results in tables

7. Identify any anomalous data/outliers. Reselect data item

Mathematical Techniques

In order to thoroughly analyze and evaluate my data, there are many
mathematical techniques, diagrams and graphs I will need to use. Here
is a list of them:

Diagrams:

1. Histograms - A histogram is constructed from a frequency table.
The intervals are shown on the X-axis and the number of scores in
each interval is represented by the height of a rectangle located
above the interval.

2. Box Plots - A box plot provides an excellent visual summary of
many important aspects of a distribution. The box stretches from
the lower quartile to the upper quartile and therefore contains
the middle half of the scores in the distribution. The median is
shown as a line across the box. Therefore 1/4 of the distribution
is between this line and the top of the box and 1/4 of the
distribution is between this line and the bottom of the box.

3. Scatter Diagram - A type of diagram used to show the relationship
between data items that have two numeric properties. One property
is represented along the x-axis and the other along the y-axis.
Each item is then represented by a single point.

4. Cumulative Frequency Graphs - A cumulative frequency graph can be
used to estimate some useful statistical measures.

5. Line Of Best Fit - Single line drawn through a series of data
points as a best representation of the underlying trend. Can be a
straight line or a curve.
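A straight line of best fit is often drawn by eye, but it can also be calculated. The sketch below uses the least-squares method, which is an assumption here since the coursework does not say how the line will be fitted. The data are the Year 7 boys' heights and weights from the sample recorded later in this coursework.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # the line passes through the mean point
    return slope, intercept

heights = [1.48, 1.59, 1.49, 1.52, 1.54, 1.55, 1.59]   # metres
weights = [44, 52, 43, 45, 43, 40, 45]                 # kilograms
slope, intercept = least_squares_line(heights, weights)
print(f"weight = {slope:.1f} * height + {intercept:.1f}")
```

The line always passes through the point (mean height, mean weight), which is also the usual rule when drawing the line by hand.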

Calculations:

1. Mean

2. Mode

3. Median

4. Mean & Modal Class for Grouped Continuous Data - This calculates
the mean for grouped continuous data.

5. Interquartile Range - The distance between the upper and lower
quartiles. As a measure of variability, it is less sensitive than
the standard deviation or range to the possible presence of
outliers. It is also used to define the box in a box-and-whisker
plot.

6. Standard Deviation - It is the most commonly used measure of
spread.

7. Normal distribution - Normal distributions are a family of
distributions that have the same general shape. They are symmetric
with scores more concentrated in the middle than in the tails.
Normal distributions are sometimes described as bell shaped.

8. Spearman's Rank Correlation Coefficient - The Spearman's Rank
Correlation Coefficient is used to discover the strength of a link
between two sets of data.

9. Equation of Line of Best fit - Equation of line that shows
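Several of the calculations listed above can be sketched with Python's standard `statistics` module. This is only an illustration on the Year 7 boys from the sample. The quartile convention is whatever `statistics.quantiles` uses by default, which may differ slightly from the method taught for this coursework, and `pstdev` (population standard deviation) is an assumed choice. Spearman's coefficient is computed here as the correlation of the ranks, with tied values sharing an average rank.

```python
import statistics

def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = statistics.mean(rx), statistics.mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

heights = [1.48, 1.59, 1.49, 1.52, 1.54, 1.55, 1.59]   # Year 7 boys, metres
weights = [44, 52, 43, 45, 43, 40, 45]                 # Year 7 boys, kilograms
q1, _, q3 = statistics.quantiles(weights, n=4)         # lower and upper quartiles
print("mean", statistics.mean(weights))
print("median", statistics.median(weights))
print("interquartile range", q3 - q1)
print("standard deviation", statistics.pstdev(weights))
print("Spearman", spearman(heights, weights))
```

A Spearman value near +1 would support the hypothesis that weight increases with height, a value near 0 would suggest little or no link, and a value near -1 would suggest the opposite trend.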

Collecting the Data

In order to find my results, I will need to sort the data and put it
into tables. As I am using stratified sampling, I have had to count up
the number of boys and girls in each year and work out my sample size.
Once I have done this, I will record my results in two separate tables
(one for males, one for females), in year order. From there, I will
create separate tables for each year and then one large mixed table.
After I have finished sorting out the tables, I will draw various
scatter diagrams: first one for males, one for females and one mixed,
and then one for each year (for both mixed and separate genders).

Finding the Results

As I have previously stated, I have decided to use a sample size of
9%, which in total is 54 people. I now need to apply that information
to the investigation and work out my sample for each year, gender etc.

Data:

| Year | Boys | Girls | Total |
|---|---|---|---|
| 7 | 75 | 75 | 150 |
| 8 | 65 | 70 | 135 |
| 9 | 62 | 68 | 130 |
| 10 | 51 | 49 | 100 |
| 11 | 41 | 44 | 85 |
| Total | | | 600 |

Sample size: 9% of 600 = 54

Now, I have to calculate how many pupils to examine within each year,
because each year group has a different number of students. I will
calculate the proportion of pupils from each of the year groups.

Stratified Sample:

| Year | Fraction of population | Number out of 54 | No. of Girls in Sample | No. of Boys in Sample |
|---|---|---|---|---|
| 7 | 150/600 = 0.25 | 13.5 | 75/150 x 13.5 = 6.75 (7) | 75/150 x 13.5 = 6.75 (7) |
| 8 | 135/600 = 0.225 | 12.2 | 70/135 x 12.2 = 6.32 (6) | 65/135 x 12.2 = 5.87 (6) |
| 9 | 130/600 = 0.2167 | 11.7 | 68/130 x 11.7 = 6.12 (6) | 62/130 x 11.7 = 5.58 (6) |
| 10 | 100/600 = 0.1667 | 9 | 49/100 x 9 = 4.41 (4) | 51/100 x 9 = 4.59 (5) |
| 11 | 85/600 = 0.1417 | 7.6 | 44/85 x 7.6 = 3.93 (4) | 41/85 x 7.6 = 3.67 (4) |

Due to rounding, my sample size has been adjusted from 54 to 55. Given
as a percentage, this would be:

55/600 x 100 = 9.166666667

= 9.2%
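The stratified allocation can be checked with a short Python sketch. It uses the pupil counts from the data table, but works from unrounded proportions rather than the intermediate figures rounded to one decimal place, so the unrounded values can differ slightly from the table while the rounded counts come out the same. The variable names are illustrative.

```python
# Pupil counts per year from the data table: (boys, girls).
counts = {7: (75, 75), 8: (65, 70), 9: (62, 68), 10: (51, 49), 11: (41, 44)}
population = sum(b + g for b, g in counts.values())   # 600 pupils
sample_size = 54                                      # 9% of 600

allocation = {}
for year, (boys, girls) in counts.items():
    # Each stratum's share is proportional to its size, then rounded
    # to a whole number of pupils.
    allocation[year] = (round(boys * sample_size / population),
                        round(girls * sample_size / population))

for year, (b, g) in allocation.items():
    print(f"Year {year}: {b} boys, {g} girls")
total = sum(b + g for b, g in allocation.values())
print("Total sampled:", total, f"({100 * total / population:.1f}%)")
```

The last line prints a total of 55 pupils (9.2%), matching the adjustment caused by rounding.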

I now need to randomly select, within each year and gender, the
designated number of pupils for each category. I will do this by using
the random function on my calculator. I need to make sure the
selection is random, so that the results will not be biased. Once I
have done this, I need to check for any anomalies in my selected
pupils' weights and heights.

Boys

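| No. | Year | Height (m) | Weight (kg) |
|---|---|---|---|
| 1 | 7 | 1.48 | 44 |
| 2 | 7 | 1.59 | 52 |
| 3 | 7 | 1.49 | 43 |
| 4 | 7 | 1.52 | 45 |
| 5 | 7 | 1.54 | 43 |
| 6 | 7 | 1.55 | 40 |
| 7 | 7 | 1.59 | 45 |
| 8 | 8 | 1.57 | 48 |
| 9 | 8 | 1.67 | 51 |
| 10 | 8 | 1.71 | 46 |
| 11 | 8 | 1.66 | 43 |
| 12 | 8 | 1.59 | 47 |
| 13 | 8 | 1.42 | 40 |
| 14 | 9 | 1.67 | 54 |
| 15 | 9 | 1.8 | 48 |
| 16 | 9 | 1.75 | 63 |
| 17 | 9 | 1.46 | 45 |
| 18 | 9 | 1.5 | 70 |
| 19 | 9 | 1.82 | 66 |
| 20 | 10 | 1.8 | 49 |
| 21 | 10 | 1.6 | 50 |
| 22 | 10 | 1.62 | 52 |
| 23 | 10 | 1.65 | 50 |
| 24 | 10 | 1.77 | 59 |
| 25 | 11 | 1.91 | 82 |
| 26 | 11 | 1.62 | 56 |
| 27 | 11 | 1.74 | 50 |
| 28 | 11 | 2 | 86 |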

Results

Girls

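| No. | Year | Height (m) | Weight (kg) |
|---|---|---|---|
| 1 | 7 | 1.61 | 45 |
| 2 | 7 | 1.61 | 47 |
| 3 | 7 | 1.56 | 43 |
| 4 | 7 | 1.48 | 42 |
| 5 | 7 | 1.5 | 40 |
| 6 | 7 | 1.56 | 53 |
| 7 | 7 | 1.58 | 48 |
| 8 | 8 | 1.72 | 43 |
| 9 | 8 | 1.62 | 53 |
| 10 | 8 | 1.62 | 54 |
| 11 | 8 | 1.6 | 46 |
| 12 | 8 | 1.75 | 45 |
| 13 | 8 | 1.48 | 46 |
| 14 | 9 | 1.57 | 38 |
| 15 | 9 | 1.62 | 54 |
| 16 | 9 | 1.64 | 40 |
| 17 | 9 | 1.6 | 46 |
| 18 | 9 | 1.8 | 60 |
| 19 | 9 | 1.6 | 51 |
| 20 | 10 | 1.52 | 45 |
| 21 | 10 | 1.72 | 56 |
| 22 | 10 | 1.66 | 45 |
| 23 | 10 | 1.73 | 42 |
| 24 | 11 | 1.7 | 50 |
| 25 | 11 | 1.68 | 48 |
| 26 | 11 | 1.52 | 38 |
| 27 | 11 | 1.62 | 48 |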

Organising My Results

Although I have already presented my results in two separate tables,
one for each gender, the results are not concise enough. In order to
fully analyse my results, I will need to put them into scatter
diagrams, histograms and so on. Therefore, my results need to be
grouped into around 5-8 groups, which are the same for both genders.
This is because I will need to compare both genders, which requires
the same groups for both sexes. Once I have chosen my groups, I will
enter the information into frequency tables and use those for my
histograms and scatter diagrams.
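Grouping can be sketched in Python as below. The class boundaries (10 cm wide, starting at 140 cm) are an assumption for illustration, as the actual groups have not yet been chosen. The heights are those of the Year 7 boys in the sample, converted to whole centimetres so that class membership is decided with exact integer arithmetic.

```python
from collections import Counter

def frequency_table(values_cm, start, width):
    """Count values in classes [start, start+width), [start+width, start+2*width), ..."""
    counts = Counter((v - start) // width for v in values_cm)
    return {(start + k * width, start + (k + 1) * width): counts[k]
            for k in sorted(counts)}

# Year 7 boys' heights from the sample, in whole centimetres.
heights_cm = [148, 159, 149, 152, 154, 155, 159]
table = frequency_table(heights_cm, start=140, width=10)
for (low, high), freq in table.items():
    print(f"{low} <= h < {high}: {freq}")

# Estimated mean for grouped data: sum(midpoint * frequency) / sum(frequency).
estimate = (sum((low + high) / 2 * f for (low, high), f in table.items())
            / sum(table.values()))
print("estimated mean height (cm):", estimate)
```

Using the same `start` and `width` for the girls' heights gives directly comparable classes for both genders.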