Wednesday, 27 February 2013

Networks of Marvel Heroes - INFOGRAPHIC

Masked vigilantes seem to be quite the socialites when it comes to the company they keep. Or are they? Find out for yourself by taking a closer look at the graphic version of the Marvel Universe’s own social network. All characters with at least 100 mutual inked appearances are present, coloured according to their team or universe affiliations. For a more detailed analysis, follow the link. For all the fans who already know all there is to know: enjoy the visual feast!

Tuesday, 19 February 2013

Inaugural Networks

Presidential inaugural speeches in the US provide a good indication of the forthcoming political agenda. A lot of research has been dedicated to this subject, but most of it focuses on keyword frequency analysis, which makes it difficult to trace changes in the political agenda over the years. The reason is that public political discourse is quite predictably dominated by such notions as “people”, “nation” and “world”. What’s interesting, however, is to detect the moments when new notions are introduced into the political agenda, as well as to trace the change in relationships between the terms. This is where text network analysis can be quite useful, so Nodus Labs created a special report for The Guardian newspaper based on US presidents’ inauguration speeches, from Nixon’s 1969 address to Obama’s 2013 address.

The analysis used the method of text network analysis. The basic premise of this approach is that every word is represented as a node and the co-occurrence of two words within the same context is represented as an edge in the network. After a series of transformations (performed by the Textexture software developed by Nodus Labs) a graph is produced, which is then laid out using the Force Atlas algorithm. The nodes (words) that are connected (co-occur within the same context) are pulled together, while the nodes that are not connected are pushed apart. The resulting layout gives a very good representation of the major semantic fields present within the text. Furthermore, community detection algorithms are applied to the resulting network, sorting the nodes (words) into different groups according to how interconnected they are. Every community is represented with a different color. As a result, if two words often co-occur inside the same text, they will be positioned next to each other on the graph and also belong to the same community (and, thus, have the same color on the graph). These communities represent the topics inside the text. Finally, the nodes are sized according to their betweenness centrality: the bigger the node, the more often it lies on the shortest paths between other words, and so the more it links different communities together.
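The steps above can be sketched in a few lines of Python. This is only an illustration, not Textexture's actual pipeline, whose transformations are not described here. The function names, the sliding-window definition of "context" and the window size are all assumptions. The sketch builds the co-occurrence graph and ranks words by betweenness centrality (using Brandes' algorithm); layout and community detection are left out.

```python
from collections import defaultdict, deque
from itertools import combinations


def cooccurrence_graph(words, window=2):
    """Link every pair of distinct words that share a sliding window (assumed context)."""
    graph = defaultdict(set)
    for start in range(len(words) - window + 1):
        for a, b in combinations(set(words[start:start + window]), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness(graph):
    """Unnormalised betweenness centrality of an undirected, unweighted graph."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        order, preds = [], defaultdict(list)
        paths = dict.fromkeys(graph, 0)
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    paths[nb] += paths[node]
                    preds[nb].append(node)
        # Accumulate path dependencies from the farthest nodes back to the source.
        delta = dict.fromkeys(graph, 0.0)
        for node in reversed(order):
            for p in preds[node]:
                delta[p] += paths[p] / paths[node] * (1 + delta[node])
            if node != source:
                score[node] += delta[node]
    # Every undirected path was counted once from each end.
    return {node: value / 2 for node, value in score.items()}


# Toy input, not taken from any speech.
words = "people nation freedom nation world".split()
centrality = betweenness(cooccurrence_graph(words))
print(max(centrality, key=centrality.get))  # nation
```

In this toy input "nation" sits between every other pair of words, so it would be drawn as the largest node.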

It’s worth noting that such an approach is very different from so-called “tag clouds”. Tag clouds show the most frequently mentioned words, and they rarely position these words according to their proximity within the text. Therefore, one can get a general idea of the vocabulary inside the text, but it’s very hard to get a sense of the meaning that is produced using this vocabulary. Text network visualization, on the other hand, emphasizes both the most frequently mentioned words and the relationships between them, making it much easier to understand what the text is about. Furthermore, it can also detect the topics inside the text, making it a much more useful tool for improving text comprehension and providing a much more usable interface for text navigation.

Bush, 2001:

Bush, 2005:

A fairly generic agenda at first sight. However, Bush was the first to introduce the notion of “time” and use it to motivate certain policies. It’s all about the Now: “In all of these ways, I will bring the values of our history to the care of our times.” It is not surprising that “story” is also such an important concept in his speech: it’s full of short stories. In 2005, after his re-election, Bush is serving his second term, probably thanks to his emphasis on “freedom” and “liberty”, a trick that has always worked in the US and that was successfully employed by Reagan in his second term (see above).

Obama, 2009:

Obama, 2013:

A master of rhetoric, Obama combines the best of his predecessors in this inauguration speech. No wonder “word” has such high relevance in his speech: it refers to the moments when Obama is quoting someone else. In the 2013 speech, “time” and “require” probably relate to the fact that Obama had to respond to all the criticism that something had to be done immediately about the state of the US economy and politics, and he successfully addressed these concerns.


Friday, 15 February 2013

Connecting the Community

We all live in multiple on-line communities, but what do these communities look like? Where are we located in each of our communities, and what role do we play?

The diagram below shows an actual on-line community [OLC]. Every node in the network represents a person. A link between two nodes reveals a relationship or connection between two people in the community -- the social network. Most on-line communities consist of three social rings -- a densely connected core in the center, loosely connected fragments in the second ring, and an outer ring of disconnected nodes, commonly known as lurkers. Communities have various levels of belonging -- each represented by one of these rings. You may belong in the core of one community, while being a peripheral lurker in another.

In the above diagram, we see three distinct types of membership in our community -- designated by blue, green and red nodes. The proportion of nodes in each ring in this population is fairly typical of most on-line communities -- the isolates [lurkers] outnumber the highly-connected by a large ratio. The outer orbit in the network above contains the blue nodes. They have been attracted to the OLC, but have not connected yet. The blue nodes contain both brand new members, who have not connected yet, and passive members who have seen no reason to connect. The passive group is the most likely to leave the OLC, or remain as absorbers-only of the content in the community.

The green nodes have a few connections -- usually with prior acquaintances. They are not connected to the larger community -- only to a small, local group. They do not feel a sense of true membership in the larger whole, though they may identify with it. The small clusters of friendships amongst the greens can be maintained by other media and do not need this particular OLC to survive. They are also likely to leave or become passive and will likely do so in unison with the rest of their small circle of friends.

The inner core of the community is composed of red nodes [zoomed-in view below]. They are very involved in the community, and have formed a connected cluster of multiple overlapping ego networks. The leaders of the OLC are embedded in this core cluster. The core members will stay and build the community. Unfortunately, they are in the minority. The core usually makes up less than 10% of an on-line group -- sometimes as little as 1% of the total OLC. Although small, it is a powerful force of attraction. It is the core that is committed and loyal to the OLC and will work on making it a success.
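The three rings can be approximated from nothing more than a member list and a list of connections. The Python sketch below is one possible reading rather than the method behind the diagram. It assumes the core is the largest connected cluster, that members with no links are the isolates, and that everyone else sits in a small fragment. The function name and labels are hypothetical.

```python
from collections import defaultdict


def classify_rings(members, links):
    """Label each member as 'core', 'fragment', or 'isolate'.

    Assumption: the core is the largest connected component, members with no
    links are isolates, and all remaining members belong to small fragments.
    """
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Collect connected components with an iterative depth-first search.
    seen, components = set(), []
    for member in members:
        if member in seen:
            continue
        stack, component = [member], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)

    core = max(components, key=len)
    labels = {}
    for component in components:
        for node in component:
            if len(component) == 1:
                labels[node] = "isolate"      # blue nodes in the diagram
            elif component is core:
                labels[node] = "core"         # red nodes
            else:
                labels[node] = "fragment"     # green nodes
    return labels


# Hypothetical community of seven members.
members = list("abcdefg")
links = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("e", "f")]
print(classify_rings(members, links))
```

A real community would need a stricter definition of the core, for instance a minimum number of connections per member, since a single large but sparse cluster is not necessarily a committed one.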

Online communities and social networks are often conceived and developed by businesses and organizations that focus on: "How can we use the online community to benefit us?" Focusing only on how to utilize the community leads many organizations to fail in building these communities! They fail at community development by not creating a strategy that makes sure their target audience gains a positive experience and practical benefits from participating in the community. It is amazing how many organizations try to build on-line social networks while ignoring the needs of the very people they are trying to attract and influence! It is then no surprise when large chunks of their target group leave when the "next big thing" comes around: SixDegrees-->Friendster-->Orkut-->MySpace-->Facebook-->Next? To build a vibrant and growing OLC, you need to support natural human behavior, not work against it. You need to think sociology, not just technology.

The field of social network analysis [SNA] gives us tools to both know the net and knit the net. SNA maps and measures the paths of information, ideas and influence in the community. SNA reveals the emergent patterns of interaction in organizations and communities and allows us to track their changes over time.

Growing a community is not just adding new members. It requires adding both people and relationships -- nodes and links. Node counts are important in social networks, but it's the relationships -- and the patterns they create -- that are key! A community thrives by its connections, not by its collections! It's the relationships, and the prospect of future relationships, that keep members active and excited.

Wednesday, 13 February 2013

Manchester City vs Liverpool: Passing network analysis

At the beginning of February, Manchester City drew 2-2 with Liverpool at the Etihad, so a football-loving blog decided to take a look at the match from a network point of view, resulting in the following research. We have already reported on something similar for basketball.

The positions of the players are loosely based on the formations played by the two teams, although some creative license is employed for clarity. It is important to note that these are fixed positions, which will not always be representative of where a player passed or received the ball. Only the starting eleven is shown on the pitch, as the substitutes weren’t hugely interesting from a passing perspective in this instance. Only completed passes are shown. Darker and thicker arrows indicate more passes between a pair of players. The player markers are sized according to their passing influence: the larger the marker, the greater the influence. The size and colour of the markers are relative to the players on their own team, i.e. they are on different scales for each team.
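The drawing rules described above can be sketched as follows. The original analysis does not define how "passing influence" is calculated, so this Python example substitutes a much simpler stand-in: the number of completed passes a player made or received. The function names, the stand-in measure and the sample passes are all hypothetical and are not match data.

```python
from collections import Counter


def passing_network(passes):
    """Count completed passes per (passer, receiver) pair and per player."""
    edges = Counter(passes)          # arrow thickness: passes between a pair
    involvement = Counter()          # stand-in for passing influence (assumption)
    for (passer, receiver), count in edges.items():
        involvement[passer] += count
        involvement[receiver] += count
    return edges, involvement


def marker_sizes(involvement, max_size=100.0):
    """Scale marker sizes relative to the busiest player on the same team."""
    busiest = max(involvement.values())
    return {player: max_size * n / busiest for player, n in involvement.items()}


# Made-up passes for a single team, listed as (passer, receiver).
passes = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A")]
edges, involvement = passing_network(passes)
print(edges[("A", "B")])
print(marker_sizes(involvement))
```

Because `marker_sizes` is applied to one team at a time, each team ends up on its own scale, as in the graphic.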

In the reverse fixture, Yaya Touré and De Jong were very influential for City, but Touré was away at the African Cup of Nations, while De Jong joined Milan shortly after that fixture. Their replacements in this game, Barry and Garcia, were less influential, although Barry had the strongest passing influence for City in this match, with Milner second. The central midfield two, Lucas and Gerrard, were very influential for Liverpool and strongly dictated the passing patterns of the team. They both linked well with the fullbacks and wider players, while Lucas also had strong links with Suárez and Sturridge. Certainly in this area of the pitch, Liverpool had the upper hand over City and this provided a solid base for Liverpool in the match.

Similarly to the Arsenal game, Liverpool showed less of an emphasis upon recycling the ball in deeper areas. Instead, they favoured moving the ball forward more directly, with Enrique often being an outlet for this via Reina and Agger. Liverpool’s fullbacks combined well with their respective wide players, while also being strong options for Lucas and Gerrard. Sturridge was generally excellent in this match and was more influential in terms of passing than in his previous games against Norwich and Arsenal, combining well with Suárez, Lucas and Gerrard.

At least based on the past few games, Liverpool have shown the ability to alter their passing approach, with a heavily possession-orientated game against Norwich followed by more direct counter-attacking performances against Arsenal and Manchester City. The game against City was particularly impressive, as this was mixed in with some good control in midfield via Lucas and Gerrard, which was absent against Arsenal. How this progresses during Liverpool’s next run of fixtures will be something to look out for.

Tuesday, 5 February 2013

Basketball Isn’t a Sport. It’s a Statistical Network

Team sports and statistics are no strangers. Take sabermetrics, which revolutionized game analysis for baseball while making it more fun to watch. The story might sound familiar if you have seen Moneyball, where Brad Pitt took on the role of Billy Beane, who pumped up the game of the Oakland A’s.

Compared to baseball, though, basketball is much more dynamic, and ball movement becomes a key variable in success. Passing is one of the fundamentals of hoops, and in the upper ranks of the sport, turnovers (often the result of wayward passes) contribute to ticks in the win-loss column. Fast, agile passing can make or break a team. That’s why sabermetrics might not tell the entire story about what happens on the court. Researchers at Arizona State University, led by life science professor and basketball fan Jennifer Fewell and math professor Dieter Armbruster, found an ideal model to explain the results of the 2010 NBA playoffs by simply keeping their eye on the ball. Their work opens the door to an entirely new line of sports analysis, from game-tape breakdown to highlight reels and augmented-reality visualizations.

Their method, not surprisingly, was network analysis, which turns teammates into nodes and exchanges (passes) into paths. From there, they created a flowchart of sorts that showed ball movement, mapping game progression pass by pass: every time one player sent the ball to another, the flowchart lines accumulated, creating larger and larger arrows. Using data from the 2010 playoffs, Fewell and Armbruster’s team mapped the ball movement of every play. Using the most frequent transactions, from the inbound pass to the shot on basket, they analyzed the typical paths the ball took around the court.

Network analysis of the Chicago Bulls, showing the majority of ball interaction remained with the point guard.

Network analysis of the Los Angeles Lakers shows the team is far more likely to distribute the ball among more players, using the “triangle offense.” 

For most teams, the inbound pass went primarily to the point guard, generally a team’s best ball handler. But point guard-centric teams, such as the Bulls, didn’t fare well in the 2010 playoffs, the researchers told Wired. On the other hand, the Los Angeles Lakers, who won the 2010 NBA championship, distributed the ball more evenly than their rivals, embracing what Phil Jackson calls the “triangle offense,” a technique pioneered by Hall of Fame coach Sam Barry. The basic idea is simple: maintain balanced court spacing so any player can pass to another at any point.

In their model, Fewell and Armbruster found a mathematical explanation for why the triangle offense works: the point guard was no longer the only player feeding passes to fellow players; his teammates were just as likely to take on that role. With more potential passers, there are more potential paths for the opposition to defend.

To quantify their results, published in the journal PLOS ONE, the researchers derived the entropy, or measure of system disorder, for each team during each game. In six of the eight first-round series, winners had higher team entropy, and therefore more randomness, than losers. Though the sample size of teams in the NBA playoffs may be small, the data suggest a possible relationship between quick, unpredictable ball movement and success in games.
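To give a feel for what "team entropy" measures, here is a minimal Python sketch that computes the Shannon entropy of a team's observed passes. It is an illustration only: the researchers' exact definition may differ, and the function name, the position labels and the two toy pass lists are assumptions, not playoff data.

```python
import math
from collections import Counter


def pass_entropy(passes):
    """Shannon entropy (in bits) of the distribution of observed passes."""
    counts = Counter(passes)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy examples: every pass follows the same route vs. four equally likely routes.
predictable = [("PG", "SG")] * 4
varied = [("PG", "SG"), ("SG", "SF"), ("SF", "PF"), ("PF", "C")]
print(pass_entropy(predictable))
print(pass_entropy(varied))
```

The more evenly the passes are spread over different passer-receiver pairs, the higher the value, which matches the idea that a team with many potential passers is harder to predict and to defend.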

While fans direct cheers that fill sports arenas toward athletic giants such as LeBron James or Kobe Bryant, bright statisticians still sit in the shadows. But when these mathematical stars begin helping LeBron improve his game, it’s certain they’ll hear more and more of the applause.

Hungry for more? Read the full article on Wired.