1.1 Big Data
Big data[1] is defined as a massive amount of data that is difficult to capture, manage, process and analyze with traditional software techniques in a reasonable time. Such data can be complex, vast, diverse and heterogeneous in nature. Sources[2] of this data include online transactions, video, emails, audio, images, logs, posts, tweets, search queries, scientific data, health records, sensor data and social media interactions.
Big data is commonly described by the following properties[4][1]:
1) Volume: Enterprises are now awash with data. Data volumes are measured in exabytes or zettabytes rather than in terabytes or petabytes.
2) Variety: Big data comes from various sources, so it does not arrive in a single format. Data is broadly divided into structured, semi-structured and unstructured. Structured data resides in fixed fields, as in tabular or relational records. Semi-structured data has no rigid schema but carries tags or metadata that describe it, as in XML or JSON. Unstructured data is free-form content such as text, messages, tweets or posts.
3) Velocity: Velocity describes the speed at which data is created, retrieved from various sources, stored and analyzed. Data streams in at an unprecedented rate and must be dealt with in a timely manner.
4) Variability: This deals with inconsistencies in data flow during periodic peaks. Data loads are challenging to manage when usage increases and a specific event triggers a peak load.
5) Veracity: This deals with the quality and provenance of received data. Data can be characterized as good, bad, undefined, ambiguous or inconsistent[2].
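The Variety property above can be illustrated with a minimal sketch that sorts incoming items into the three categories. The field names in FIXED_FIELDS and the classification rules are assumptions made for illustration only; real systems use far richer schema detection.

```python
import json

# Hypothetical fixed schema for a "structured" record (an assumption,
# not something defined in the text above).
FIXED_FIELDS = ("customer_id", "amount", "timestamp")


def classify_variety(item):
    """Return 'structured', 'semi-structured' or 'unstructured'."""
    if isinstance(item, dict):
        # A record whose keys match the fixed schema behaves like a table row;
        # any other key/value record is treated as semi-structured.
        if set(item) == set(FIXED_FIELDS):
            return "structured"
        return "semi-structured"
    if isinstance(item, str):
        try:
            parsed = json.loads(item)
        except json.JSONDecodeError:
            # Free text such as a tweet or a message.
            return "unstructured"
        # JSON objects and arrays carry their own tags, so they count
        # as semi-structured.
        if isinstance(parsed, (dict, list)):
            return "semi-structured"
    return "unstructured"


row = {"customer_id": 7, "amount": 19.99, "timestamp": "2015-03-01T10:00:00"}
doc = '{"user": "alice", "tags": ["sale", "spring"]}'
post = "Loving the new phone, battery lasts all day!"

labels = [classify_variety(x) for x in (row, doc, post)]
print(labels)
```

The sketch only inspects the shape of each item, which is enough to show why mixed sources cannot be loaded into a single fixed table without extra processing.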
1.1.1 Challenges of Big Data
• Growth of Data: With advances in technology and science, we are inundated with data. Unstructured data in enterprises...
... middle of paper ...
...lysis. Sensor data exists both as data at rest and as data in motion. This massive volume of data needs to be analyzed for safety and efficiency purposes.[4]
4) Social Media: Big data is used most heavily in social media and in maintaining customer relationships[10]. Analyzing customer reviews of a product helps business organizations understand their market reputation and their competitors. Analyzing large records of users accessing sites, and counting the sites accessed by different users, shows which sites get the most hits, which helps in placing advertisements where they will be seen by the most users.
5) Risk Analysis: Financial organizations need to process large amounts of data in order to calculate risk. A large amount of data is still underutilized and needs adequate processing and integration before risk patterns can be analyzed.[4]
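The site-hit analysis described under Social Media can be sketched in a few lines. This is a minimal illustration, assuming the access log is already available as (user, site) pairs; the sample users and site names are made up, and a real deployment would run the same counting over a distributed log store.

```python
from collections import Counter


def count_hits(access_log):
    """Count total hits per site from (user, site) pairs."""
    return Counter(site for _user, site in access_log)


def unique_visitors(access_log):
    """Count how many distinct users accessed each site."""
    visitors = {}
    for user, site in access_log:
        visitors.setdefault(site, set()).add(user)
    return {site: len(users) for site, users in visitors.items()}


# Hypothetical sample log; names are placeholders.
log = [
    ("alice", "news.example"),
    ("bob", "shop.example"),
    ("alice", "shop.example"),
    ("carol", "shop.example"),
    ("bob", "news.example"),
    ("alice", "shop.example"),
]

hits = count_hits(log)
top_site, top_hits = hits.most_common(1)[0]
print(top_site, top_hits)      # shop.example 4
print(unique_visitors(log))
```

Ranking by total hits and by distinct visitors can give different answers, so an advertiser would typically look at both before choosing where to place an advertisement.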
For example, suppose you are trading in the stock market. There are thousands of companies and millions of customers, so the volume of customer information and holdings is vast. The rate at which that data is generated is called velocity. For example, the New York Stock Exchange deals with “1 TB of information during each trading session” (McKinsey Global Institute, page 1). That is a terabyte generated within a single trading session. Velocity deals with how fast the data is being generated. The next category we use to describe Big Data is variety. For the most part, the variety of Big Data can come from either traditional sources such as hospitals or non-traditional sources such as social media platforms like Twitter. Big
The characteristics of this unstructured data are high volume, high velocity, or high variety and complexity. Big data comes from sensors, devices, video/audio, networks, log files, transactional applications, the web and social media, much of it generated in real time and on a very large scale.
Advances in big data technology have shown that it has the potential to enhance the government’s analysis capability in areas such as citizen-centric service delivery. Big data also provides insights into social networks and relationships, and allows predictive models to be developed for a number of applications. Of broader interest to agencies, big data analysis may provide profound insights into a number of key areas of society including health care, medical and other sciences, transport and infrastructure, education, communication, meteorology and social
There is no single agreed definition of big data, but in the literature the term tends to refer to the use of behavioral analytics, predictive analytics or other advanced data analytics methods to extract value from data.[1]
8.) Data: facts or information. People use data as a basis for drawing conclusions about the topic or theme they are studying.
You may ask what big data analytics is. SAS, a leading company in business analytics software and services, describes big data analytics as “the process of examining big data to uncover hidden patterns, unknown correlations and other useful information that can be used to make better decisions.” Many companies seek insights from the massive amount of structured, unstructured and binary data at their disposal in order to improve business decisions and outcomes, so it is evident why big data analytics is a big deal. According to IBM, big data differs from traditional data gathering in that it is captured, managed and processed with low latency, and it has one or more of the following characteristics: high volume, high velocity or high variety. It comes from sensors, devices, video/audio, networks, log files, the web and social media, and much of it is generated in real time and on a very large scale. In other words, companies moving towards big data analytics are able to see faster results, but the data continues to grow to exceptional levels, faster than the average person can keep up with.
In 2001, Doug Laney, an analyst at META Group (now part of the research company Gartner), said that the challenges and opportunities of data growth can be described in three dimensions: increasing amount (volume), increasing speed (velocity) and increasing diversity (variety). Gartner, along with many other companies and organizations in the field of information technology, continues to use these "3Vs" to define Big Data. In 2012, Gartner added that, beyond these three properties, Big Data requires new forms of processing to support decision making, deeper insight into things and events, and workflow optimization.
Currently the world has a wealth of data, stored all over the planet (the Internet and the Web are prime examples), but that data still needs to be understood. It has been stated that the amount of data doubles approximately
Digital forensics is becoming increasingly important in computing and often requires the intelligent analysis of large amounts of complex data extracted from a crime scene. A digital forensics investigation is the process of extracting and analyzing digital evidence for use as admissible proof of committed crimes. This process can help to reconstruct the events of a crime. Big Data Analytics (BDA) is the process of analyzing Big Data. Its objective is to extract knowledge patterns from massive volumes of input data.
In effect, big data gives companies the opportunity to know and understand their customers. With Big Data, a company holds a lot of information about its customers, such as their habits, the products they prefer, the time they make purchases, how many products they buy and what kind of promotion they prefer. The company can then relate this information to who the person is, including age and gender, and adapt its offer to the customer. A good client, for example, may receive a private discount or a larger discount. The objective is to know each customer personally and make them feel unique.
The last decade can be marked as a period of significant change in the business world. Office workers had become accustomed to using computers as a powerful tool, with office applications such as Microsoft Word and Excel, and in the 1990s they first had the opportunity to share information over the Internet (McNurlin, 2009). However, the situation changed even more with the transition to the third millennium. With the further development of information technologies, the majority of big enterprises had to restructure their business processes and make the transition to the Internet economy. Enterprise resource planning (ERP), supply-chain management (SCM) and customer relationship management (CRM) software, along with a variety of other information systems, became essential components of the new economy. All these complex solutions were designed to bring great benefits to different sides of corporate activity: decisions made by top managers are expected to come nearer to the ideal, customer service is expected to improve, and collaboration is expected to become more productive. Nevertheless, to ensure the desired results it should be taken into account that the key concept behind these reorganizations is information, or data, and dealing with it can be a serious issue. The wide use of data warehouses in contemporary organizations confirms this fact.
Cloud computing is a type of computing that depends on sharing computing resources rather than having local servers or personal devices handle applications.
Information privacy, or data privacy, is the relationship between the distribution of data, technology, the public expectation of privacy, and the legal and political issues surrounding them.
Adopting big data can also save the banking industry from a great deal of embarrassment caused by growth in the number of customers, which in turn requires banks to improve their performance. As stated earlier, banks are entrusted with a lot of information, and this information must be kept safe and must be readily accessible in a timely fashion. A normal small database will not be enough to perform this operation, and if banks do not embrace big data they might start to experience failures in their systems.