This chapter gives more insight into the Naïve Bayes algorithm and shows how the method works. It also describes how our model will be developed, which data sets will be used in the process and how they were chosen. Finally, it looks at feature selection and how it will be applied.
THE NAÏVE BAYES CLASSIFIER
$$P(H \mid E) = \frac{P(E \mid H) \times P(H)}{P(E)}$$
The fundamental concept of Bayes' rule is that the probability of a hypothesis or an event (H) can be calculated based on the presence of some observed evidence (E). From Bayes' rule, we have:
1. A prior probability of H or P(H): This is the probability of an event before observing the evidence.
2. A posterior probability of H or P(H | E): This is the probability of an event after observing the evidence.
For example, to estimate the probability of a mail belonging to the Human Resources (HR) class, we can use evidence such as how often words like “Employment” occur in it.
Using the equation above, let ‘HR’ be the event of a mail belonging to HR and ‘Employment’ be the evidence that the word Employment occurs in the mail. Then we have:
$$P(\text{HR} \mid \text{Employment}) = \frac{P(\text{Employment} \mid \text{HR}) \times P(\text{HR})}{P(\text{Employment})}$$
P(Employment | HR) is the probability that the word “Employment” occurs in a mail to HR. Of course, “Employment” could occur in many other mail classes such as Joint Venture or Procurement and Contracting, but here we only consider “Employment” in the context of class “HR”. This probability can be obtained from historical mail collections.
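As a rough illustration of how these quantities combine, the sketch below applies Bayes' rule to counts taken from a historical mail collection. The counts and the helper name `posterior` are made up for illustration and do not come from the data used in this work.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    if evidence <= 0:
        raise ValueError("P(E) must be positive")
    return likelihood * prior / evidence


# Hypothetical counts from a historical mail collection (illustrative only).
total_mails = 1000
hr_mails = 200               # mails labelled HR
hr_mails_with_word = 120     # HR mails containing "Employment"
all_mails_with_word = 150    # mails of any class containing "Employment"

p_hr = hr_mails / total_mails                      # prior P(HR)
p_word_given_hr = hr_mails_with_word / hr_mails    # likelihood P(Employment | HR)
p_word = all_mails_with_word / total_mails         # evidence P(Employment)

# Posterior P(HR | Employment)
p_hr_given_word = posterior(p_word_given_hr, p_hr, p_word)
print(round(p_hr_given_word, 3))
```

With these assumed counts the posterior equals the share of "Employment" mails that are HR mails (120 of 150), which is a quick sanity check on the formula.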
P (HR) is the prior probability of the HR class. This probability can be estimated from r...
... middle of paper ...
...st results. Because no information about the test set was used in developing the classifier, the results of this experiment should be indicative of actual performance in practice.
It is important not to look at the test data while developing the classifier, and to run systems on it as sparingly as possible. Violating this rule undermines the validity of your results, because running many variant systems and keeping the tweaks that worked best on the test set implicitly tunes your system to the test data.
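One simple way to respect this rule is to set the test data aside once, before any development starts. The sketch below is only an assumption about how such a split could be done: the function name `split_dataset`, the 20% test fraction, and the fixed seed are illustrative choices, not the procedure used in this work.

```python
import random


def split_dataset(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and return (training, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(items)
    # A fixed seed keeps the split identical across runs, so the test set
    # never changes while the classifier is being developed.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical mail identifiers standing in for a labelled collection.
mails = [f"mail_{i}" for i in range(10)]
train, test = split_dataset(mails)
print(len(train), len(test))
```

All tuning would then use only `train`, with `test` touched once for the final evaluation.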
In this chapter we have described the Naïve Bayes theory and how we built the classifier. In the next chapter we will take a closer look at the training set and test set, and carry out an evaluation of the classifier we developed.