Data Handling
In this coursework, I shall investigate whether there is a correlation
between the height and weight of students at a fictional school,
Mayfield High. I expect a positive correlation between height and
weight, since it makes sense that the taller someone is, the heavier
they are likely to be. I am going to investigate the differences
between each year group, and between males and females overall. I am
doing this for a number of reasons. Firstly, the range in the higher
years is likely to be smaller, because many students have finished
growing or have passed their peak growth rate, so the data is likely
to be more closely clustered. I also feel it would be interesting to
see whether the correlations for males and females differ, and if so,
how.
To help me do this, I am going to create scatter graphs, histograms
and box-and-whisker plots, as I feel these will greatly help my
investigation. I will use scatter graphs to compare all of my data and
to find correlations and the standard deviation. I will use my
histograms and box-and-whisker plots to investigate further the weight
differences between each year group and between the boys and girls,
finding the quartiles, the median and so on.
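The summary statistics mentioned above (median, quartiles and standard deviation) can be sketched in Python using the standard library. The weights below are hypothetical illustrative values, not taken from the actual Mayfield High data set:

```python
import statistics

# Hypothetical sample of student weights in kg (illustrative only;
# not drawn from the real Mayfield High data).
weights = [38, 41, 45, 47, 50, 52, 54, 57, 60, 63, 66, 72]

median = statistics.median(weights)              # middle value of the sorted data
q1, q2, q3 = statistics.quantiles(weights, n=4)  # quartiles (exclusive method)
iqr = q3 - q1                                    # interquartile range
std_dev = statistics.pstdev(weights)             # population standard deviation

print(f"median = {median}, Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"standard deviation = {std_dev:.2f}")
```

The quartiles and IQR are exactly what a box-and-whisker plot displays, so the same numbers would feed directly into those diagrams.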
WHY SAMPLE?
I could carry out this investigation using all 1183 pieces of data;
however, this would be extremely time-consuming. I am therefore going
to take a random stratified sample. If I make my sample large enough,
I am confident that its results will give a good indication of the
results for the population as a whole. Sampling also simplifies my
calculations and graphs dramatically. However, I must make sure that
the sample I take is unbiased, otherwise my results will be invalid.
TAKING THE SAMPLE
To work out how many pieces of data I need to sample from each group,
I used the following formula:

(Number of students in the group ÷ Total number of students) × Overall sample size

The size of the sample must be quite small, because it is stated so in ...
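The stratified allocation given by that formula can be sketched in Python. The year-group sizes and the overall sample size below are assumptions for illustration, not the actual Mayfield High figures:

```python
# Hypothetical year-group sizes (illustrative; the coursework data
# set has 1183 students in total).
year_groups = {
    "Year 7": 280,
    "Year 8": 270,
    "Year 9": 260,
    "Year 10": 200,
    "Year 11": 173,
}

total_students = sum(year_groups.values())  # 1183 in this sketch
sample_size = 100                           # assumed overall sample size

# Apply the formula: (group size / total students) * overall sample
# size, rounded to the nearest whole student.
allocation = {
    group: round(size / total_students * sample_size)
    for group, size in year_groups.items()
}
print(allocation)

# Within each group, the required number of students would then be
# chosen at random, e.g. with random.sample(), to keep the sample
# unbiased.
```

Rounding each group separately means the totals can drift a student or two from the target sample size, which is normal for stratified sampling.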