Composite and Mutual Link Prediction using SVM in Social Networks

Composite and Mutual Link Prediction using SVM in Social Networks

Length: 3055 words (8.7 double-spaced pages)

Rating: Excellent

Open Document

Essay Preview

More ↓

Link prediction is a key technique in many applications in social networks; where potential links between entities need to be predicted. Typical link prediction techniques deal with either uniform entities, i.e., company to company, applicant to applicant links, or non-mutual relationships, e.g., company to applicant links. However, there is a challenging problem of link prediction among the composite entities and mutual links; such as accurate prediction of matches on company dataset, jobs or workers on employment websites, where the links are mutually determined by both entities that composite entity belong to disjoint groups. The causes of interactions in these domains makes composite and mutual link prediction significantly different from the typical version of the problem. This work addresses these issues by proposing the Support Vector Machine model. By implementing the proposed algorithm it is expected that the accuracy will get increased in the link prediction problem.

Key words: Link prediction, Potential links, Composite, Mutual links, Support Vector Machine.

A social networking service is an online service, platform, or site that target to facilitating the building of social networks or social relations between people who, for example, share importance's, activities, backgrounds, or real-life communications. A social network service consists of a representation of each user, his/her social links, and a variety of new services. This is used to model the interaction among the communities on the social networks. Where the graphs are used to represent the interactions between those communities, in which nodes are representing to people in some communities and links are representing the association between those people.
Understanding the association between two specific nodes by predicting the likelihood of a future but not currently existing association between them is a fundamental problem known as link prediction.

Interaction on the social network involves both positive and negative relationships, e.g., since attempts to establish a relationship may fail due to decline from the expected target. This generates links that signify rejection of invitations, disapproval of applications, or expression of disagreement with others' opinions. Such social networks are mutual since the sign of a link indicating whether it is positive or negative depends on the attitudes or belief of both entities forming the link. Moreover, mutual positive and negative relationships have been even less investigated.

Fig.1: Collaborative Information for Composite and Mutual Link Prediction

Interaction Predicted Preference Predicted match

Recently, social network analysis has had a variety of applications, such as online dating sites, education admission portals as well as jobs, employment, career and hiring sites, where people in the networks have different roles and links between them can only between people in different roles.

How to Cite this Page

MLA Citation:
"Composite and Mutual Link Prediction using SVM in Social Networks." 19 Apr 2019

Need Writing Help?

Get feedback on grammar, clarity, concision and logic instantly.

Check your paper »

Essay on Numerical Study of Tensile Properties of POM/ Woven Kenaf Composite.

- Numerical Study of Tensile Properties of POM/ Woven Kenaf Composite. CHAPTER 1: INTRODUCTION 1.1 Background Study The geometry of woven fiber reinforced polymer like Polyoxymethylene (POM) and Woven Kenaf composite is quite complex. Therefore, to predict tensile properties of such composite requires numerical method. This project focuses on the use of The Laminator software and understanding the theories on Maximum Stress, Maximum Strain, Tsai-Hill, Hoffman, and Tsai-Wu failure to produces the required results....   [tags: woven fiber composite, geometry, ]

Research Papers
1555 words (4.4 pages)

Evaluating The Performance Of Mutual Funds Essays

- EVALUTATION OF MUTUAL FUNDS: In India, currently, there square measure many of the mutual funds are jointly investment companies operating each within the public sector yet as within the private sector sector. These contend with alternative one another for mobilizing the investment with individual investors and other organizations appetent of inserting their funds with these non-depository financial institution investment company would really like to grasp the comparative performance of every thus on choose the simplest mutual fund or investment company....   [tags: Investment, Mutual fund, Hedge fund]

Research Papers
909 words (2.6 pages)

Essay about The Discovery And Widespread Usage Of Composite Materials

- The discovery and widespread usage of composite materials has opened new doors in the manufacturing industry. Composites are materials composed of two or more different materials. The new material has distinct and superior structural properties than the individual component materials. The most ancient composite is adobe, which consists of straw reinforced clay and was commonly used as a building material [1]. Modern composites consist of a woven fiber, which reinforces a plastic polymer shell. Two major examples of modern composites are fiberglass and carbon fiber....   [tags: Carbon fiber, Composite material, Boeing 787]

Research Papers
1602 words (4.6 pages)

Benefits Of Using A Social Bot Essay

- Social Bot What is a social bot. To put it plain and simple a social bot is a piece of software that is designed to appear human that will interact with people on social networks or a web site. For a successful social bot to perform users need to think that they are interacting with an actual human. A social bot will typically provide short simple information and determine if it needs to be redirected to an actual support personnel. The biggest concern many users have with social bots is the question whether or not they are ethical to use (Socialbot, n.d.)....   [tags: Social media, Sociology, Customer, Human]

Research Papers
1276 words (3.6 pages)

Using Socially Responsible Investing When Deciding on an Investment Essay

- ... As the universe of investment is reduced, investors will benefit less from the potential for diversification than in an unconstrained portfolio which will result in lower risk-adjusted returns. Furthermore, the additional costs of monitoring social performance might also cause socially responsible funds to underperform. While proponents of SRI agree that the application of SRI screening criteria results in a reduction of investment opportunities, they argue that the loss of portfolio efficiency is offset because the unsustainable practices of excluded companies will make them less profitable over time....   [tags: mutual funds, market, ethical]

Research Papers
1978 words (5.7 pages)

Problems of Using Social Networks Essay

- In today’s society, billions of people across the world are accessing the Internet multiple times a day. Many people have the Internet on their smart phones, and with the touch of a button can check their email, Facebook, Twitter, their bank account balance, movie times, GPS, Instagram, or anything else online. Most of these things take up large amounts of time from people’s lives. Whether it is a child at school or an adult at work, they are all a member of some sort of social network and check it often....   [tags: lazy, overweight, less productive, anti-social]

Research Papers
906 words (2.6 pages)

The Management Of The Lic Mutual Fund Essay

- The LIC Mutual Fund was set up by the LIC of India in June, 1989 with the objective of mobilizing savings of people especially from rural and semi-urban areas and improving their income ensuring at the same time safety, liquidity and security of their investments. The Fund continued to work towards realizing these objectives. The Assets under Management (AUM) of LIC Nomura MF AMC Ltd. stood at Rs. 6112.16 crores as on 31st March, 2013 with a market share of 0.86%. Mutual Fund Industry‟ AUM as on 31st March, 2013 stood at Rs....   [tags: Investment, Bond, Mutual fund, Hedge fund]

Research Papers
937 words (2.7 pages)

Corporate Fund : A Mutual Fund Essay

- A mutual fund is an investment vehicle in which investor’s pool their resources together to invest in multiple different debt and equity instruments. They are operated by mutual fund managers who attempt to create a diversified portfolio, while still trying to achieve capital gains on the assets in the fund, paying close attention to the investors’ appetites for risk. The goal of a mutual fund is to pool multiple investor’s funds together so they can be invested at a level higher than any one investor could achieve on their own, with the end game being returns that can be enjoyed by all in a “mutual” benefit system....   [tags: Investment, Mutual fund, Hedge fund, Finance]

Research Papers
765 words (2.2 pages)

The Behavior Of Mutual Fund Essay

- The Behavior of Mutual Fund Managers and Investors Executive Summary The behaviors of mutual fund managers and the investors have been the subject of study in behavioral finance. The objective of the study is to determine whether there exists any relationship in the behaviors of the two parties to the mutual fund engagement. The study is based on the prospective theory and uses the cross-sectional research design to meet two research purposes. The two research purpose include to investigate whether there is granger causality between the behavior of the fund managers and the behavior of the investors and secondly, to investigate whether the relationship between the two variables can be used t...   [tags: Mutual fund, Investment, Finance, Hedge fund]

Research Papers
709 words (2 pages)

Mutual Fund : A Fund Essay

- A „Mutual Fund‟ is an investment vehicle for investors who pool their savings for investment in diversified portfolio of securities with the aim of attractive yields and appreciation in their value. The investment managers of the funds manage these savings in such a way that the risk is minimized and steady return is ensured. As per Mutual Fund Book, published by Investment Company Institute of the U.S., “A Mutual Fund is a financial service organization that receives money from shareholders, invests it, earns returns on it, attempts to make it grow and agrees to pay the shareholder cash on demand for the current value of his investment.” 90 Securities and Exchange Board of India (Mutual...   [tags: Investment, Mutual fund, Bond, Hedge fund]

Research Papers
1554 words (4.4 pages)

Such networks are composite, creating challenges for link prediction since existing approaches focus only on uniform networks where nodes in the networks have the same role and any of them may link to any other.
In this work, we propose a framework of composite and mutual link prediction on the network. Problem of the link prediction addresses the sign of links among the composite entities in the network. We address this problem and it is resolved by machine learning and to construct the structural feature for the machine learning. First we are going to creating the structure of Tetrad, which consists of graph with four nodes in which establish three paths among these nodes on the network based on set of collaborative filtering. Let us consider the first node is company that is the source node and fourth node is target node that is applicant. In between the second node is similar node of applicant and third one is similar node of company. The sign of the link (positive or negative) established between these tetrad paths structures and predicting links that should be satisfied whole interaction among the three path structure to recommending the target node of applicant to the source node of the company. Interest and reliability of these nodes (Company and Applicant) on the graph represent the collaborative information among the communities on the network. Temporality of feature is constructed based on the nodes (people) behavior on network structure. Each node on the network has taken unique time value for making the interactions among the nodes. Recency and Activeness of the nodes behaviors are measured on the network. Finally the properties of these features define the reliability of nodes on the network.

Hasan, Chaoji et al.,(2006)have proposed the link prediction to find the interaction between the nodes in the dataset. The data in different analysis applications such as social networks, web analysis, and collaborative filtering consists of relationships among the communities, which can be considered as links, between people. For instance, two people may be linked to each other if they exchange their taste and their lifestyles.

The aim is to recommend items that match the taste (likes or dislikes) of users in order to assist the active user, addressed by Cain, Bain et al., (2010) traditional collaborative filtering approach. the user who will receive recommendations, to select items from an overwhelming set of choices. Such systems have many uses in purchasing sites, subscription based services and other online applications, where provision of personalized suggestions is required.

Qiu, Yen et al.,(2011) defining temporal metric statistics by combining traditional statistical measures with measures commonly employed in financial analysis and traditional social network analysis. To use time series to describe node behavior, calculate temporal features from the time series to characterize behavior evolution, and use the temporal features to improve link prediction.
In order to facilitate accurate predictions and explore the different factors that drive link creation we explore the use of three feature sets: social, topical and visibility. Rowe, Stankovic et al., (2012) have proposed the Logistic Regression, high entropy model, Random model. Predicting follower edges within a directed social network by graphs and thereby significantly outperforming models.
In this work, we consider the composite and mutual link prediction problem. We propose a framework for addressing this link prediction problem for machine learning to understanding the concept of structural feature for learning. Nodes behavior on the network makes the construction of feature for learning based on collaborative filtering. Machine Learning algorithms to measure the accuracy and precision of the link prediction problem for varies applications of the social networks. These predicted measures varies based on the method that we are going to use such as k-N Nearest, decision tree, logistic regression which has constructed on the existing system and Menon, Elkan(2011) have proposed the support vector machine to calculate the better accuracy of link prediction problem.

Existing System

Link prediction is defined as the inference of new interactions among the members of a given social network. Given a directed graph G = (V,E) with a sign (positive or negative) on each Edge, we let s(u, v) denote the sign of the edge (u, v) from node u to node v. That is, s(u, v) = 1 when the sign of (u, v)) is positive, -1 when negative. For different formulations of our task, we suppose that for a particular edge (u, v), the sign s(u, v) is hidden and that we trying to infer it.
Monadic Features, In the social networks, the liveliness of an entity have impact on the behavior of the entity4. First defining the monadic features based on the degree of the outgoing edges, as well as degree of incoming edges. An outgoing edge of a node v is an edge that directs from v to another node. The degree of outgoing edges of a node (do (v)) is the number of outgoing edges from that node v. The degree of positive outgoing edges of a node (d+o (v)) is the number of outgoing edges from that node v with positive sign. Similarly we have to apply this to negative outgoing edge d−o (v) and also for degree of incoming edge with positive and negative sign such as d+i (v), d−i (v). The four monadic features d+o (v), d−o (v), d+i (v) and d−i (v) will be used in our method to represent the general attitude of an entity in a network.

Dyadic Features, We also define dyadic features based on collaborative information. We make use of collaborative information for link prediction and extract dyadic features as in collaborative filtering4. For example, in company to applicant's recommendation system, a link only exists between a composite pair, i.e. a company type and applicant type. Therefore, we consider a three step path involving both nodes within a future link, which is defined as a tetrad.

A tetrad t(u, sv, su, v) or t(u, v) is a three step path among four different nodes in a graph, where the source node is u (sender) and target node is v (recipient). A tetrad t(u, v) captures a two step relationship across two types, which is the minimum indirect path between a pair of nodes (u, v).
s +,-

r s
+,- +,-

Fig. 2: Tetrad Structure

The typical mutual collaborative filtering for people to people recommendation with preferences of positive edge sign4, 5 . In the Fig 2, ua is the source node, ur is the target node, ustr is similar node for ur and usaa is similar nodes node for ua . Where S and s mean a link from a precursor to a successor, R and r mean a link is to a precursor from a successor, and their signs, where + means a positive link and - negative link. To give an example, n(r+s-r+) means the total number of the type of tetrad t(u, sv, su, v) that have a positive link from u to sv, a negative link to sv from su and a positive link from su to v.
Similarly, add one more new type of collaborative information to the sender based inverted collaborative filtering to capture the preference of the recipient as shown in Fig 2. Which allow both positive and negative signs for interaction. Features based on inverted collaborative information are summarized in Table1 connoted by RRS, RSS, SRR and SSR.

The first set of features to capture the recipient's preference and another set of features to capture the sender's preference. We again allow both positive and negative signs in any interaction within the configuration. Features based on preference transmission are connoted by SSS and RRR .

Table 1. Dyadic Features Based on Mutual Collaborative Information.


r+s+r+ s+r+s+ r+s+s+ s+r+r+ s+s+r+ r+r+s+ s+s+s+ r+r+r+
r+s+r- s+r+s- r+s+s- s+r+r- s+s+r- r+r+s- s+s+s- r+r+r-
r+s-r+ s+r-s+ r+s-s+ s+r-r+ s+s-r+ r+r-s+ s+s-s+ r+r-r+
r+s-r- s+r-s- r+s-s- s+r-r- s+s-r- r+r-s- s+s-s- r+r-r-
r-s+r+ s-r+s+ r-s+s+ s-r+r+ s-s+r+ r-r+s+ s-s+s+ r-r+r+
r-s+r- s-r+s- r-s+s- s-r+r- s-s+r- r-r+s- s-s+s- r-r+r-
r-s-r+ s-r-s+ r-s-s+ s-r-r+ s-s-r+ r-r-s+ s-s-s+ r-r-r+
r-s-r- s-r-s- r-s-s- s-r-r- s-s-r- r-r-s- s-s-s- r-r-r-

To Learning and Testing the predict links4, first calculate the feature values and then calculate a measure of combined feature strength as the weighted combination of feature values, as follows:


Where s is the combined feature strength, xi the value of the ith feature and wi the weight value for xi. To learn the weights and convert this combined feature strength into an edge sign prediction, we use logistic regression, which will output a value in the range of (0, 1) representing the probability of a positive edge sign:


where p is the predicted probability of an positive edge sign. The instances are then classified into positive or negative according to the thresholding of the probability value
with respect to a threshold.

Proposed System

Qiu, Yen et al.,(2011) have proposed the temporal feature5. Defining temporal metric statistics by combining traditional statistical measures with measures commonly employed in financial analysis and traditional social network analysis. These metrics are estimated over time for a sequence of sociograms. It has shown that some of the temporal extensions of traditional metrics increase the accuracy of link prediction problem.
Temporal features, to use time series to describe node behavior, calculate temporal features from the time series to characterize behavior evolution, and use the temporal features to improve link prediction5. Simple Statistics: This type of feature includes simple first-order temporal features such as Recency and Activeness. Recency measures the length of time elapsed since a node made its last connection. Activeness measures the number of connections made by a node in the latest time step.
Activeness indicates that a node is very active in the last time step and is likely to be active in the future that shown on Fig 4.

Recency and Activeness are calculated by making window with 20 time stamp values in fig 3. In this timing window, we are going to calculating the number of occurrences in the entire time stamp for all the nodes. Suppose a particular node with the same time

Fig.3 : Construction of features to our algorithms.

stamp values is occur means that will be added in the corresponding time stamp value on the timing window. Then we are going to calculate the total number of occurrences in timing window for each of the node on the data set. More inactive node (Recency) is found and shown in the fig 4. as red color for corresponding the node.
Like which. Activeness of the node on the network is calculated by total occurrences of time stamp value is divided by timing window value 20 for the entire node on the dataset. A maximum connection that means node with maximum average value is considered for more active node on the network that is shown in fig 4. as red color for corresponding the node.

The temporal features to characterize the evolution of node behavior, and our experimental results suggest that including these temporal features significantly improve link prediction performance.

Support Vector Machine, Menon and Elkan (2011), Support vector machine (SVM) is one of the machine learning algorithms.
Given some training data D, a set of n points of the form,
where the yi is either 0 or 1, indicating the class to which the point xi belongs. , xi the value of the ith feature and wi the weight value for xi.
SVMs are a set of related supervised learning methods used for classification and regression techniques. Given a set of training examples, each considered as belonging to one of two categories, an SVM training algorithm constructs a model that predicts whether a positive link into one category or the other. The algorithm learns a classification model from set of previously labeled (pre-classified) data, and then applies the acquired knowledge to classify the links into two classes: positive links and negative links6.

Fig. 4 : Activeness and Recency of nodes from the dataset
Predicted positive links are represented as 1 and the negative links are as 0 on the features construction model from fig 3. In this maximum number of tuples will be considered for the training dataset that have been given as input into the Support Vector Machine classifier and other data samples will be considered for testing samples. Matrix of data points with each row corresponding to a support vector in the normalized data space.

This matrix is a subset of the Training input data matrix, after normalization has been applied according to the 'Auto Scale' argument. The sign of the weight is positive for support vectors belonging to the first group, and negative for the second group in Fig3.
Train a support vector machine, then call the trained machine to classify (predict) new data. In addition, to capture the satisfactory predictive accuracy, we can use different types SVM kernel functions, and we must tune the parameters of the kernel functions. Try different parameters for training, and check via cross validation to predict the best parameters. After obtaining a reasonable initial parameter, we might want to refine our parameters to obtain better accuracy.

Training the SVM with specified kernel parameters and algorithm. The parameters are,

*param train
-Where parameter train is a training set.
*param p
-Where parameter p is a Kern parameters.
*param alg
-Where parameter alg is a Chosen algorithm.

Then the svmtrain function to train the training samples with above the parameters for various cross validations(10 fold cross validation that we use). Applying SMO algorithm to pass the two parameters problem and kernel parameters for training the samples and also taken test samples to test the dataset to measures the accuracy. the same process will be repeated for different size of the training dataset along with the testing sample dataset to measured the corresponding accuracy values as Sensitivity and Specificity shown on the table 2. Sensitivity or True Positive Rate(TPR) is measured and as follows
where TP is True Positive and FN represents False Negative such that the Specificity or True Negative Rate is measured and as follows
where TN is True Negative and FP represents False Positive.


The company dataset is used for the required output. The datasets were collected from a commercial social network site containing interactions between users. It contains more than 10000 records and four attributes(From node, To node, Sign and Time) where the nodes are users and Link sign represents either positive sign link or negative sign link then finally corresponding link established time factor. The following tables, which shows Accuracy measures for different sampling of datasets.

Table 2. Accuracy measures With train dataset Training dataset LR-accuracy(%) SVM-accuracy(%)
1 720 records 74.4 80.67
2 840 records 73.99 79.26
3 980 records 70.54 77.81

For above the table results, the graphical representation is drawn below. In this graph, X-axis represents Accuracy and Y-axis represents Performance to shown that the accuracy measures between the Logistic Regression (LR) and Support Vector Machine (SVM) methods.

Fig 5. LR vs SVM

For the collected data set, evaluation of the logistic regression method gives the 72.97% of accuracy overall the experiment conducted that shown on the table 2. To compare with the logistic regression method the Support Vector Machine method gives the 79.24% of accuracy overall the experiment conducted by us on table 2. On the Accuracy of the proposed method outperforms the Logistic Regression method by 6.05%.


1. Cai .X, Bain .M, Krzywicki .A, Wobcke .W, Kim Y.S, Compton.P and Mahidadia. A "Learning Collaborative Filtering and Its Application to People to People Recommendation in Social Network", Proceedings of IEEE- International Conference on Data Mining, pp 743-748 (2010).

2. Hasan M.A, Chaoji .V,Salem .S and Zaki .M "Link Prediction using Supervised Learning" , Processing SDM 06 workshop on Link Analysis, Counterterrorism and Security, pp 394-415 (2006).

3. Leskovec .J, Huttenlocher .D and Kleinberg

.J "Predicting Positive and

Negative Links in Online Social Networks", Proceedings of the 19th International Conference on World Wide Web, pp 641-650 (2010).
4. Cai .X, Bain .M, Krzywicki .A, Wobcke.W and Kim Y.S "Reciprocal and Heterogeneous Link Prediction in Social networks", Proceedings of the 16th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, Vol.II, pp 193-204 (2012).

5. Qiu.B, He.Q and Yen.J "Evolution of Node Behaviour in Link Prediction", Proceedings
of the Twenty-Fifth conference on Artificial Intelligence (2011).
6. Rowe .M, Stankovic .M and Alani.H , "Who will follow whom? Exploiting semantics for link prediction in attention-information networks", Proceedings of the 11th International Semantic Web Conference, pp 476-491 (2012).

7. Menon .A.K and Elkan .C "Link Prediction via Matrix Factorization", Proceeding of the European Conference on Machine Learning and Knowledge Discovery in Databases, Vol. II, pp 437-452 (2011).

8. Ouyang .T.Y "Leveraging Temporal Features for Link Prediction in Communication Networks", Massachusetts Institute of Technology Cambridge (2007).
9. Wang .C, Satuluri .V, Parthasarathy .S "Local Probabilistic Models for Link Prediction", Proceedings of the 2007 Seventh IEEE International Conference on Data Mining, pp 322-331 (2007).

10. Hopcroft .J, Lou .T, Tang .J "Who Will Follow You Back? Mutual Relationship Prediction", Proceedings of the 20th ACM International Conference on Information and Knowledge Management, pp 1137-1146 (2011).

Return to