Plotting correlation matrices in R

Correlation matrices give us relevant information, but as raw tables of numbers they can be hard to read.

R provides several functions to plot correlation matrices. The following is a very simple but effective solution using the “corrplot” package, applied here to a 10x10 matrix of random values in [-1, 1].

library(corrplot);

data <- array(data=runif(100, min=-1, max=1)
 , dim=c(10,10)
 , dimnames=list(paste("A", 1:10, sep=""), paste("B", 1:10, sep="")));

corrplot(data, type="full", order="hclust", tl.col="black", tl.srt=45)

And this is the plot:

[Figure: correlation plot generated by the code above]

Positive correlations are displayed in blue and negative correlations in red. The color intensity and the size of each circle are proportional to the correlation coefficient. The color legend shows which color corresponds to which coefficient value.
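The matrix above is filled with random values rather than correlations computed from data. With real data, the matrix would typically come from cor(). The following is a minimal sketch, assuming the mtcars data set that ships with R (it is not used in the original example and only stands in for your own numeric data):

```r
library(corrplot)

# mtcars ships with R; it stands in here for your own numeric data
M <- cor(mtcars)  # Pearson correlation matrix: symmetric, with 1s on the diagonal

# same plotting options as before, but showing only the upper triangle,
# since a correlation matrix is symmetric
corrplot(M, type = "upper", order = "hclust", tl.col = "black", tl.srt = 45)
```

Because M is symmetric, type = "upper" drops the redundant lower half without losing information.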

Further examples can be found at http://www.sthda.com/english/wiki/correlation-matrix-a-quick-start-guide-to-analyze-format-and-visualize-a-correlation-matrix-using-r-software

Posted in Data Mining

MDAI 2015 (Modeling Decisions for Artificial Intelligence)

Slides of our paper:

J. Casas-Roma (2015). An Evaluation of Edge Modification Techniques for Privacy-Preserving on Graphs. Proceedings of the 12th International Conference on Modeling Decisions for Artificial Intelligence, volume 9321 of Lecture Notes in Computer Science, pp. 180-191. 2015. Springer International Publishing. doi:10.1007/978-3-319-23240-9_15.

MDAI2015-jcasasr-presentation

Posted in Artificial Intelligence, Computer Security, Graph Mining, Social network analysis

SoMeRis 2015 (social media and risk)

Slides of our paper:

J. Casas-Roma and F. Rousseau (2015). Community-preserving generalization of social networks. Proceedings of the social media and risk ASONAM 2015 workshop (SoMeRis ’15)

SoMeRis2015-jcasasr-presentation

Posted in Data Mining, Graph Mining, Privacy-preserving, Social network analysis

Plotting the coreness of a network with R and igraph

Briefly, the k-core of a graph is the maximal connected subgraph in which every vertex has degree at least k within the subgraph. The coreness of a vertex is the largest k for which the vertex belongs to a k-core. It is a useful tool for analyzing the connectivity of a network, and it is used in several domains, such as clustering, community discovery and anonymity.
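As a small illustration of the definition, the following sketch computes the coreness of a toy graph: a triangle with one extra vertex hanging off it. The graph is made up for this example, and the sketch assumes a recent igraph version, where coreness() is the current name of graph.coreness().

```r
library(igraph)

# Toy graph (made up for illustration): triangle 1-2-3 plus pendant vertex 4
g <- make_graph(c(1, 2, 2, 3, 1, 3, 3, 4), directed = FALSE)

# coreness of each vertex: the largest k such that the vertex is in a k-core
k <- coreness(g)
print(k)

# the 2-core is the subgraph induced by the vertices with coreness >= 2
core2 <- induced_subgraph(g, which(k >= 2))
print(vcount(core2))
```

The three triangle vertices have coreness 2, because each keeps two neighbours inside the triangle, while the pendant vertex has coreness 1.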

It can be quite useful to visualize the coreness structure of small or medium-sized networks in order to better understand the concept. To the best of our knowledge, there is no layout in igraph that properly visualizes it. Although the code is very simple, we show it here with a minimal example.

library(igraph);

CorenessLayout <- function(g) {
  coreness <- graph.coreness(g);
  xy <- array(NA, dim=c(length(coreness), 2));

  shells <- sort(unique(coreness));
  for(shell in shells) {
    # radius of the ring: low-coreness shells far out, high-coreness shells near the centre
    v <- 1 - ((shell-1) / max(shells));
    nodes_in_shell <- sum(coreness==shell);
    # evenly spaced angles in radians (sin and cos expect radians, not degrees)
    angles <- seq(0, 2*pi, length.out=nodes_in_shell+1);
    angles <- angles[-length(angles)]; # remove last element (same position as the first)
    xy[coreness==shell, 1] <- sin(angles) * v;
    xy[coreness==shell, 2] <- cos(angles) * v;
  }
  return(xy);
}

# g is the network (an igraph graph object), e.g. g <- make_graph("Zachary") for the karate network
# compute coreness
coreness <- graph.coreness(g);
# assign colors
colbar <- rainbow(max(coreness));
# create layout
ll <- CorenessLayout(g);
# plot
plot(g, layout=ll, vertex.size=15, vertex.color=colbar[coreness], vertex.frame.color=colbar[coreness], main='Coreness');

The result looks like this (karate network):

[Figure: coreness layout of the karate network]

Posted in Data Mining, Graph Mining

Finding relevant links in a network

In the following paper, the authors propose a new method to compute the most important edges (or links) in a network:

Grady, D., Thiemann, C., & Brockmann, D. (2012). Robust classification of salient links in complex networks. Nature Communications, 3(May), 864:1–864:10. doi:10.1038/ncomms1847

This method automatically finds “salient” links in networks by using the following approach:

  1. Choose a property that defines the weight of an edge in a given network,
  2. Define the distance between two adjacent nodes as the reciprocal of the weight of the edge that joins them (1 / weight),
  3. Using this distance measure, take a node and compute the shortest path to every other node in the network,
  4. Combine all edges on those shortest paths into a set, which is called the Shortest Path Tree (SPT),
  5. For each edge in the SPT, increment its “salience” counter,
  6. Repeat steps 3-5 for every other node in the network,
  7. For each edge, divide its salience counter by the total number of nodes in the network, leading to a salience value in the range [0, 1].
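The steps above can be sketched in R with igraph. This is only a minimal illustration, not the authors' reference implementation: edge_salience is a hypothetical helper name, and the sketch assumes an undirected, connected graph with strictly positive weights stored in the edge attribute weight.

```r
library(igraph)

# Salience of every edge of a weighted igraph graph `g`.
# `edge_salience` is a hypothetical helper, not an igraph function.
edge_salience <- function(g, weights = E(g)$weight) {
  dist <- 1 / weights                    # step 2: distance = 1 / weight
  counter <- numeric(ecount(g))          # one salience counter per edge
  for (v in seq_len(vcount(g))) {        # step 6: repeat for every node
    # step 3: shortest paths from v to all nodes, returned as edge sequences
    sp <- shortest_paths(g, from = v, to = V(g), weights = dist,
                         output = "epath")
    # step 4: the union of the edges on those paths is the SPT rooted at v
    spt <- unique(unlist(lapply(sp$epath, as.integer)))
    # step 5: every edge of the SPT gets one more count
    counter[spt] <- counter[spt] + 1
  }
  counter / vcount(g)                    # step 7: normalise to [0, 1]
}

# Toy example (made up): a triangle with one weak edge (1-2) and two strong ones
g <- make_graph(c(1, 2, 2, 3, 1, 3), directed = FALSE)
E(g)$weight <- c(1, 10, 10)
salience <- edge_salience(g)
print(salience)
```

In this toy triangle the weak edge 1-2 never lies on a shortest path (going through node 3 costs 0.2 instead of 1), so its salience is 0, while the two strong edges appear in every shortest path tree and get salience 1. Note that shortest_paths() returns a single shortest path per target, so ties between equally short paths are broken arbitrarily.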

You can find an interesting application example in:

Jonatan Samoocha. Finding Important Connections In A Network – Automatically. http://blog.xebia.com/2013/01/21/finding-important-connections-in-a-network-automatically/

Posted in Graph Mining, Information flow, Social network analysis