Monday 27 May 2013

Is football really a simple game?! The hidden networks behind Bayern's success!



The infographic was created by Avalanche.

With the power of network visualization, the dynamics of football games can be understood better than ever. Maven7’s analyst team is a huge fan of sports (check out our last analysis about the chances of the Hungarian water-polo team at the London Olympics), especially football.

As everybody knows, "football is a simple game; 22 men chase a ball for 90 minutes and at the end, the Germans always win". So why do so many people admire this simple form of entertainment? Why do dozens of analysts try to predict who will win a certain game or championship? Why is betting a huge business? The answer is simple: this game is not simple at all! Behind every pass, attack and goal, human dynamics have a strong impact. Network analysis offers a new approach to understanding team dynamics during football games.

Our recent infographic shows the hidden networks of the two finalists of the 2013 Champions League. Let’s face the big question: can network science explain why Bayern won and not Dortmund?

If you look at the pictures, similarities and differences are easy to spot. The network structures and patterns resemble each other because both teams used the same line-up structure. Two defenders (green) had strong mutual pass connections in both teams, but Dortmund focused on the right back and Bayern on the left back. Each team had a preferred defensive midfielder, Schweinsteiger and Gündogan respectively, who was the top choice to pass to in midfield. OK, so both teams are German and both have the same line-up, but what is the difference then?

Why did Bayern win?

Dortmund’s midfielder Reus was the preferred pass target among the attacking midfielders. The penalty that Dortmund received also came from a situation after a pass to Reus.

In attacking midfield, Bayern was more active on the wings, and their whole network is not as centralized as Dortmund’s. Bayern’s midfield cooperated better: their network shows more mutual connections, and Ribery’s supportive role on the left wing made the whole attack very successful. Dortmund’s attacking midfield, by contrast, has no mutual connections, and their whole midfield has only one. Bayern’s attacking midfield has a mutual connection between Robben and Ribery, and the midfield as a whole has 3 mutual connections (Schweinsteiger - Ribery, Müller - Robben, Ribery - Martinez), which may indicate stronger cohesion in the midfield.
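The idea of a "mutual connection" can be made concrete with a small sketch. The pass counts below are invented for illustration, and the threshold of 5 passes in each direction is an assumption; the infographic does not state how its links were drawn.

```python
from itertools import combinations

# Hypothetical pass counts: (passer, receiver) -> completed passes.
# Invented for illustration; this is not the real match data.
passes = {
    ("Schweinsteiger", "Ribery"): 9, ("Ribery", "Schweinsteiger"): 7,
    ("Müller", "Robben"): 8, ("Robben", "Müller"): 6,
    ("Ribery", "Martinez"): 5, ("Martinez", "Ribery"): 6,
    ("Schweinsteiger", "Müller"): 7, ("Müller", "Schweinsteiger"): 2,
}

def mutual_connections(passes, threshold=5):
    """Return pairs of players who each passed to the other at least `threshold` times."""
    players = sorted({player for pair in passes for player in pair})
    mutual = []
    for a, b in combinations(players, 2):
        # A link is mutual only if it is strong in both directions.
        if passes.get((a, b), 0) >= threshold and passes.get((b, a), 0) >= threshold:
            mutual.append((a, b))
    return mutual

print(mutual_connections(passes))
```

With this made-up data, Schweinsteiger and Müller do not count as mutually connected, because the passes flow mostly one way.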

The performance of the midfielders also reflects the performance of their teams. Schweinsteiger passed more actively and more accurately (87 attempts, 73 successful, 84%) than Gündogan (56 attempts, 31 successful, 62%), and while Bayern had 640 passes altogether with 72% efficiency, Dortmund had only 448 passes with 60% efficiency.

Interestingly, attacks that started from the goalkeeper were more common for Dortmund. In general, Dortmund’s defense played a more attacking role; while Dante passed mostly backwards, Boateng passed forwards.

Monday 18 March 2013

The Harlem Shake Story - aka. Birth of a Meme

If you still have not heard of the Harlem Shake you must be living in a cave. Much has been written about the rapid and global spread of this catchy internet meme, yet little is understood about how it spread. A series of remixed videos along with a number of key communities around the world triggered a rapid escalation, giving the meme widespread global visibility. Who were the initial communities behind this mega-trend? SocialFlow took a look at 1.9 million tweets during a two-week period that included the words ’harlem shake’, or some versions of it.

The Harlem Shake itself is a dance style born in New York City more than 30 years ago. During halftime at street ball games held in Rucker Park, a skinny man known in the neighborhood as Al. B. would entertain the crowd with his own brand of moves, a dance that around Harlem became known as 'The Al. B.' Though it started in 1981, the Harlem Shake became mainstream in 2001 when G. Dep featured the dance in his music video "Let's Get It". While mining Twitter data, references to Harlem Shake (the original dance) were seen quite often prior to it becoming a popular meme. When someone tweets, "I just passed my final exams! *harlem shakes*," it's the equivalent of saying "I just passed my final exams! Look at me dancing!" While Baauer's now infamous track was released on Diplo's Mad Decent label back in August 2012 (posted to YouTube on August 23, 2012), it only accrued minor visibility for the first few months. Then February hit, and something changed.

The timeline below highlights the very first days as the meme was taking off. In blue, we see references to the 1980s dance (*harlem shakes*), while the green curve represents tweets that use the phrase 'The Harlem Shake', many of them linking to one of the first three versions of the meme on YouTube.

On February 2, The Sunny Coast Skate (TSCS) group established the form of the meme in a YouTube video they uploaded. On Feb. 5, PHL_On_NAN posted a remix (v2), gaining 300,000 views within 24 hours and prompting further parodies shortly after. On Feb. 7, YouTuber hiimrawn uploaded a version titled "Harlem Shake v3 (office edition)" featuring the staff of online video production company Maker Studios. The video became a hit, amassing more than 7.4 million views over the following week and inspiring a number of contributions from well-known Internet companies, including BuzzFeed, CollegeHumor, Vimeo and Facebook.



SocialFlow looked at the social connections amongst users who were posting about the meme. This gave them the ability to identify the underlying communities engaging with the meme at a very early stage. In the graph above, each node represents a user that was actively posting and referencing the Harlem Shake meme on Twitter on Feb 7 or 8. Connections between users reflect follow/friendship relationships. The graph is organized using a force-directed algorithm and colored based on modularity, highlighting dominant clusters: regions in the graph which are much more interconnected. These clusters represent groups of users who tend to have some attribute in common. The purple region in the graph (left side) represents African American Twitter users who are referencing Harlem Shake in its original context. There's very little density there as it is not really a tight-knit community, but rather a segment of users who are culturally aligned and are clearly much more interconnected amongst themselves than with other groups.
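To give a feel for what "colored based on modularity" measures, here is a minimal sketch that scores a given grouping of users with Newman's modularity Q. The toy follow graph and the grouping are invented, and the function only evaluates a partition; it does not search for the best one, which is what a community detection tool would do.

```python
def modularity(edges, communities):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    community_of = {node: i for i, group in enumerate(communities) for node in group}
    q = 0.0
    for i, group in enumerate(communities):
        # Edges with both endpoints inside this community.
        internal = sum(1 for u, v in edges
                       if community_of[u] == i and community_of[v] == i)
        total_degree = sum(degree.get(node, 0) for node in group)
        # Observed share of internal edges minus the share expected by chance.
        q += internal / m - (total_degree / (2 * m)) ** 2
    return q

# Two hypothetical triangles of users joined by a single follow link.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
print(modularity(edges, [{"a", "b", "c"}, {"d", "e", "f"}]))
print(modularity(edges, [{"a", "b", "c", "d", "e", "f"}]))
```

Splitting the toy graph into its two triangles scores higher than lumping everyone together, which is the sense in which a cluster is "much more interconnected" internally than with the rest.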



After a similar analysis on the following two days (Feb 9 and 10), different communities can be seen emerging, resulting in a much more tightly knit graph structure. While the same dense cluster of musicians and DJs (in turquoise) still exists, there are substantially more self-identified YouTubers across both the US and the UK. At the same time there's a significant gamer / machinima cluster that's also participating, as well as a growing Jamaican contingent, and quite a few Dutch profiles (purple, left). Additionally, we see various celebrity and media accounts who caught on to the meme: @jimmyfallon, @mashable and @huffingtonpost. By capturing the two snapshots, we can also make sense of the evolution of the meme as it becomes more and more visible. At first, loosely connected communities are separately humored by the videos. Within days, we see major media outlets jump on board, and a much more intertwined landscape. We see different regions in the world light up, and identify communities of important YouTube enthusiasts who effectively get this content to spread.



Memes have become a sort of distributed mass spectacle, a mechanism that both captures people's attention and defines what is "cool" or "trendy." We see more and more companies and brands try to associate themselves with certain memes as a way to maintain a connection with their audience and gain the cool factor. Pepsi did this with the Harlem Shake and saw an incredibly positive response.


As we get better at identifying these trends and trend-setting communities early on, the pressure to participate will rise. As social networks become globally intertwined, we're witnessing a growing number of memes conquer the world at large. These moments are critical points in time, where significant levels of attention are given to a specific entity, be it a joke, a funny video or a political topic. Piecing together data from social networks can help us identify critical points in time, as well as the underlying communities and trendsetters for the humor-based memes, or the agenda setters for politically slanted ones. The only question is: what will be the next one, cashing in on its 15 minutes?

Hungry for more? Read the full article on HuffPost.

Tuesday 5 March 2013

Networking autism

A new American study using network analysis may help explain some classic behaviors in autism.

A look at how the brain processes information finds a distinct pattern in children with autism spectrum disorders. Using EEGs to track the brain’s electrical cross-talk, researchers from Boston Children’s Hospital have found a structural difference in brain connections. Compared with neurotypical children, those with autism have multiple redundant connections between neighboring brain areas at the expense of long-distance links.



Peters, Taquet and senior authors Simon Warfield, PhD, of the Computational Radiology Laboratory and Mustafa Sahin, MD, PhD, of Neurology, analyzed EEG recordings from two groups of autistic children: 16 children with classic autism, and 14 children whose autism is part of a genetic syndrome known as tuberous sclerosis complex (TSC). They compared these readings with EEGs from two control groups: 46 healthy neurotypical children and 29 children with TSC but not autism. In both groups with autism, there were more short-range connections within different brain regions, but fewer connections linking far-flung areas. A brain network that favors short-range over long-range connections seems to be consistent with autism’s classic cognitive profile: a child who excels at specific, focused tasks like memorizing streets, but who cannot integrate information across different brain areas into higher-order concepts. For example, a child with autism may not understand why a face looks really angry, because his visual brain centers and emotional brain centers have less cross-talk. The brain cannot integrate these areas. It’s doing a lot with the information locally, but it’s not sending it out to the rest of the brain.

The most popular autistic character of the silver screen is Raymond Babbitt from Rain Man, a true savant with amazing memory and mathematical skills, but an inability to adapt to change. A cinematic fun fact: Kim Peek, the real-life inspiration for the character, had a condition other than autism.

Network analysis—a hot emerging branch of cognitive neuroscience—showed a quality called “resilience” in the children with autism—the ability to find multiple ways to get from point A to point B through redundant pathways. Much like you can still travel from Boston to Brussels even if London Heathrow is shut down, by going through New York’s JFK airport for example, information can continue to be transferred between two regions of the brain of children with autism. In such a network, no hub plays a specific role, and traffic may flow along many redundant routes. It’s a simpler, less specialized network that’s more rigid, less able to respond to stimulation from the environment.
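The airport analogy translates directly into a reachability check on a graph. The sketch below uses a breadth-first search over a toy route map built from the airports named above; the route list is invented for illustration and has nothing to do with the EEG data in the study.

```python
from collections import deque

# Toy route map based on the airports mentioned above (illustrative only).
routes = {
    "Boston": ["London Heathrow", "New York JFK"],
    "London Heathrow": ["Boston", "Brussels"],
    "New York JFK": ["Boston", "Brussels"],
    "Brussels": ["London Heathrow", "New York JFK"],
}

def reachable(graph, start, goal, closed=()):
    """Breadth-first search that ignores every node listed in `closed`."""
    if start in closed or goal in closed:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, []):
            if neighbour not in closed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False

print(reachable(routes, "Boston", "Brussels", closed=("London Heathrow",)))
print(reachable(routes, "Boston", "Brussels", closed=("London Heathrow", "New York JFK")))
```

A network is resilient in this sense when removing any single intermediate node still leaves a route; only closing both hubs cuts Boston off from Brussels here.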

 
Do we have your curiosity and your attention? Read more on Psypost.

Wednesday 27 February 2013

Networks of Marvel Heroes - INFOGRAPHIC

Masked vigilantes seem to be quite the socialites when it comes to the company they keep. Or are they? Find out for yourself by taking a closer look at the graphic version of the Marvel Universe’s own social network. All characters with at least 100 mutual inked appearances are present, coloured according to their team or universe affiliations. For a more detailed analysis, follow the link. For all the fans who already know all there is to know: enjoy the visual feast!


Tuesday 19 February 2013

Inaugural Networks

Presidential inaugural speeches in the US provide a good indication of the forthcoming political agenda. There has been a lot of research dedicated to this subject; however, most of it focuses on keyword frequency analysis, which makes it difficult to trace the change in political agenda over the years. The reason is that public political discourse is quite predictably dominated by such notions as “people”, “nation”, “world”. What’s interesting, however, is to detect the moments when new notions are introduced into the political agenda, as well as to trace the change in relationships between the terms. This is where text network analysis can be quite useful, so Nodus Labs created a special report for The Guardian newspaper based on the US presidents’ inauguration speeches from Nixon’s 1969 to Obama’s 2013 address.

The analysis used the method of text network analysis. The basic premise of this approach is that every word is represented as a node and the co-occurrence of two words within the same context is represented as an edge in the network. After a series of transformations (performed by the Textexture software developed by Nodus Labs) a graph is produced, which is then laid out using the Force Atlas algorithm. The nodes (words) that are connected (co-occur within the same context) are pulled together, while the nodes that are not connected are pushed apart. The resulting graph gives a very good representation of the major semantic fields present within the text. Furthermore, community detection algorithms are applied to the resulting network, sorting the nodes (words) into different groups according to how interconnected they are. Every community is represented with a different color. As a result, if two words often co-occur inside the same text, they will be positioned next to each other on the graph and also belong to the same community (and, thus, have the same color on the graph). These communities represent the topics inside the text. Finally, the nodes are sized according to their betweenness centrality: the bigger the node, the more different communities it connects.
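The first step, turning text into a word network, can be sketched as follows. This is not Textexture's actual pipeline, whose transformations are not described here; the sliding window of adjacent words, the tiny stop-word list and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not Textexture's list).
STOP_WORDS = {"the", "of", "and", "to", "a", "in", "our", "we", "is"}

def cooccurrence_edges(text, window=2):
    """Count how often two distinct words fall within `window` consecutive words."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    edges = Counter()
    for i, word in enumerate(words):
        # Pair each word with the words that follow it inside the window.
        for other in words[i + 1 : i + window]:
            if other != word:
                edges[tuple(sorted((word, other)))] += 1
    return edges

sample = "The people of the nation and the people of the world"
for (a, b), weight in cooccurrence_edges(sample).items():
    print(a, b, weight)
```

Each key of the returned counter is an undirected edge between two words and each value is the edge weight; a layout and community detection step would then operate on these edges.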

It’s worth noting that such an approach is very different from so-called “tag clouds”. Tag clouds show the most frequently mentioned words and they rarely position these words according to their proximity within the text. Therefore, one can get a general idea of the vocabulary inside the text, but it’s very hard to get a sense of the meaning that is produced using this vocabulary. Text network visualization, on the other hand, emphasizes both the most frequently mentioned words and the relationships between them, making it much easier to understand what the text is about. Furthermore, it can also detect the topics inside the text, making it a much more useful tool for improving text comprehension and providing a much more usable interface for text navigation.

Bush, 2001:

Bush, 2005:


Quite a generic agenda at first sight; however, Bush was the first to introduce the notion of “time” and use it to motivate certain policies. It’s all about the Now: “In all of these ways, I will bring the values of our history to the care of our times.” It is not surprising that “story” is also such an important concept in his speech: it’s full of short stories. In 2005, with the re-election over, Bush is starting his second term, probably thanks to his emphasis on “freedom” and “liberty”, a trick that always worked in the US and that was successfully employed by Reagan in his second term (see above).

Obama, 2009:
Obama, 2013:



A master of rhetoric, Obama combines the best of his predecessors in this inauguration speech. No wonder “word” has such high relevance in his speech: it refers to the moments when Obama is quoting someone else. In the 2013 speech, “time” and “require” probably relate to the fact that Obama had to respond to all the criticism that something had to be done immediately about the state of the US economy and politics, and he successfully addressed these concerns.

Source:

Friday 15 February 2013

Connecting the Community


We all live in multiple on-line communities, but what do these communities look like? Where are we located in each of our communities, and what role do we play?

The diagram below shows an actual on-line community [OLC]. Every node in the network represents a person. A link between two nodes reveals a relationship or connection between two people in the community -- the social network. Most on-line communities consist of three social rings -- a densely connected core in the center, loosely connected fragments in the second ring, and an outer ring of disconnected nodes, commonly known as lurkers. Communities have various levels of belonging -- each represented by one of these rings. You may belong in the core of one community, while being a peripheral lurker in another.



In the above diagram, we see three distinct types of membership in our community -- designated by blue, green and red nodes. The proportion of nodes in each ring in this population is fairly typical of most on-line communities -- the isolates [lurkers] outnumber the highly-connected by a large ratio. The outer orbit in the network above contains the blue nodes. They have been attracted to the OLC, but have not connected yet. The blue nodes contain both brand new members, who have not connected yet, and passive members who have seen no reason to connect. The passive group is the most likely to leave the OLC, or remain as absorbers-only of the content in the community.

The green nodes have a few connections -- usually with prior acquaintances. They are not connected to the larger community -- only to a small, local group. They do not feel a sense of true membership in the larger whole, though they may identify with it. The small clusters of friendships amongst the greens can be maintained by other media and do not need this particular OLC to survive. They are also likely to leave or become passive and will likely do so in unison with the rest of their small circle of friends.



The inner core of the community is composed of red nodes [zoomed-in view below]. They are very involved in the community, and have formed a connected cluster of multiple overlapping ego networks. The leaders of the OLC are embedded in this core cluster. The core members will stay and build the community. Unfortunately they are in the minority. The core usually makes up less than 10% of the members of most on-line groups, and sometimes as little as 1% of the total OLC. Although small, they are a powerful force of attraction. It is the core that is committed and loyal to the OLC and will work on making it a success.
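The three rings described above can be approximated from a list of members and links. In this minimal sketch the core is taken to be the largest connected cluster, fragments are the smaller connected groups, and isolates are members with no links at all; this is a simplifying assumption, since a real analysis would likely also look at how densely the core is connected. The member names and links are invented.

```python
def classify_members(members, links):
    """Split members into core (largest cluster), fragments, and isolates."""
    neighbours = {m: set() for m in members}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Find connected components with a simple depth-first traversal.
    components, seen = [], set()
    for member in members:
        if member in seen:
            continue
        stack, component = [member], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)

    isolates = {m for m in members if not neighbours[m]}
    connected = [c for c in components if len(c) > 1]
    core = max(connected, key=len) if connected else set()
    fragments = (set().union(*connected) - core) if connected else set()
    return {"core": core, "fragments": fragments, "isolates": isolates}

# Hypothetical community: one cluster of four, one pair, two lurkers.
members = ["a", "b", "c", "d", "e", "f", "g", "h"]
links = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("e", "f")]
rings = classify_members(members, links)
print(rings)
```

Comparing the sizes of the three sets gives the proportions discussed above, for example how heavily the isolates outnumber the core.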

Online communities and social networks are often conceived and developed by businesses and organizations that focus on: "How can we use the online community to benefit us?" Focusing only on how to utilize the community leads many organizations to fail at building these communities! They fail at community development by not creating a strategy that makes sure their target audience gains a positive experience and practical benefits from participating in the community. It is amazing how many organizations try to build on-line social networks while ignoring the needs of the very people they are trying to attract and influence! It is then no surprise when large chunks of their target group leave when the "next big thing" comes around: SixDegrees-->Friendster-->Orkut-->MySpace-->Facebook-->Next? To build a vibrant and growing OLC, you need to support natural human behavior, not work against it. You need to think sociology, not just technology.

The field of social network analysis [SNA] gives us tools to both know the net and knit the net. SNA maps and measures the paths of information, ideas and influence in the community. SNA reveals the emergent patterns of interaction in organizations and communities and allows us to track their changes over time. Growing a community is not just adding new members. It requires adding both people and relationships -- nodes and links. Node counts are important in social networks, but it's the relationships -- and the patterns they create -- that are key! A community thrives by its connections, not by its collections! It's the relationships, and the prospect of future relationships, that keep members active and excited.

Wednesday 13 February 2013

Manchester City vs Liverpool: Passing network analysis


At the beginning of February, Manchester City drew 2-2 with Liverpool at the Etihad, so a football-loving blog decided to take a look at the match from a network point of view, resulting in the following research. We have already reported on something similar regarding basketball.



The positions of the players are loosely based on the formations played by the two teams, although some creative license is employed for clarity. It is important to note that these are fixed positions, which will not always be representative of where a player passed/received the ball. Only the starting eleven is shown on the pitch, as the substitutes weren’t hugely interesting from a passing perspective in this instance. Only completed passes are shown. Darker and thicker arrows indicate more passes between two players. The player markers are sized according to their passing influence: the larger the marker, the greater the influence. The size and colour of the markers are relative to the players on their own team, i.e. they are on different scales for each team.
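A passing network of this kind starts from a list of completed passes. The sketch below aggregates such a list into weighted arrows and a simple per-player involvement share. The pass events are invented, and the involvement share is only a stand-in: the original analysis does not say here which measure it uses for "passing influence".

```python
from collections import Counter

# Hypothetical completed passes as (passer, receiver); not real match data.
events = [
    ("Lucas", "Gerrard"), ("Gerrard", "Lucas"), ("Lucas", "Suárez"),
    ("Gerrard", "Enrique"), ("Lucas", "Gerrard"), ("Enrique", "Suárez"),
]

def pass_network(events):
    """Aggregate pass events into weighted directed edges and involvement shares."""
    edge_weights = Counter(events)  # arrow thickness: passes per (passer, receiver)
    involvement = Counter()
    for passer, receiver in events:
        involvement[passer] += 1
        involvement[receiver] += 1
    total = sum(involvement.values())
    # Share of all pass involvements per player, computed within one team only.
    influence = {player: count / total for player, count in involvement.items()}
    return edge_weights, influence

edge_weights, influence = pass_network(events)
print(edge_weights.most_common(1))
print(influence)
```

Because the shares are normalised within one team's events, they are on a per-team scale, which matches the note above that marker sizes are not comparable across teams.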

In the reverse fixture, Yaya Touré and De Jong were very influential for City but Touré was away at the African Cup of Nations, while De Jong joined Milan shortly after that fixture. Their replacements in this game, Barry and Garcia, were less influential, although Barry had the strongest passing influence for City in this match, with Milner second. The central midfield two, Lucas and Gerrard, were very influential for Liverpool and strongly dictated the passing patterns of the team. They both linked well with the fullbacks and wider players, while Lucas also had strong links with Suárez and Sturridge. Certainly in this area of the pitch, Liverpool had the upper hand over City and this provided a solid base for Liverpool in the match.
Similarly to the Arsenal game, Liverpool showed less of an emphasis upon recycling the ball in deeper areas. Instead, they favoured moving the ball forward more directly, with Enrique often being an outlet for this via Reina and Agger. Liverpool’s fullbacks combined well with their respective wide-players, while also being strong options for Lucas and Gerrard. Sturridge was generally excellent in this match and was more influential in terms of passing than in his previous games against Norwich and Arsenal, combining well with Suárez, Lucas and Gerrard.
At least based on the past few games, Liverpool have shown the ability to alter their passing approach with a heavily possession orientated game against Norwich, followed up by more direct counter-attacking performances against Arsenal and Manchester City. The game against City was particularly impressive as this was mixed in with some good control in midfield via Lucas and Gerrard, which was absent against Arsenal. How this progresses during Liverpool’s next run of fixtures will be something to look out for.


Tuesday 5 February 2013

Basketball Isn’t a Sport. It’s a Statistical Network


Team sports and statistics are no strangers: take sabermetrics, which revolutionized game analysis for baseball while making it more fun to watch. The story might sound familiar if you have seen Moneyball, where Brad Pitt took on the role of Billy Beane, who pumped up the game of the Oakland A’s.

Compared to baseball, though, basketball is much more dynamic, and ball movement becomes a key variable in success. Passing is one of the fundamentals of hoops, and in the upper ranks of the sport, turnovers (often the result of wayward passes) contribute to ticks in the win-loss column. Fast, agile passing can make or break a team. That’s why sabermetrics might not tell the entire story about what happens on the court. Researchers at Arizona State University, led by life science professor and basketball fan Jennifer Fewell and math professor Dieter Armbruster, found an ideal model to explain the results of the 2010 NBA playoffs by simply keeping their eye on the ball. Their work opens the door to an entirely new line of sports analysis, from game-tape breakdown to highlight reels and augmented-reality visualizations.

Their method, not surprisingly, was network analysis, which turns teammates into nodes and exchanges (passes) into paths. From there, they created a flowchart of sorts that showed ball movement, mapping game progression pass by pass: every time one player sent the ball to another, the flowchart lines accumulated, creating larger and larger arrows. Using data from the 2010 playoffs, Fewell and Armbruster’s team mapped the ball movement of every play. Using the most frequent transactions, from the inbound pass to the shot on basket, they analyzed the typical paths the ball took around the court.



Network analysis of the Chicago Bulls, showing the majority of ball interaction remained with the point guard.


Network analysis of the Los Angeles Lakers shows the team is far more likely to distribute the ball among more players, using the “triangle offense.” 

For most teams, the inbound pass went primarily to the point guard, generally a team’s best ball handler. But point guard-centric teams, such as the Bulls, didn’t fare well in the 2010 playoffs, the researchers told Wired. On the other hand, the Los Angeles Lakers, who won the 2010 NBA championship, distributed the ball more evenly than their rivals, embracing what Phil Jackson calls the “triangle offense,” a technique pioneered by Hall of Fame coach Sam Barry. The basic idea is simple: maintain balanced court spacing so any player can pass to another at any point. In their model, Fewell and Armbruster found a mathematical explanation for why the triangle offense works: the point guard was no longer the only player feeding passes to fellow players; his teammates were just as likely to take on that role. With more potential passers, there are more potential paths for the opposition to defend.

To quantify their results, published in the journal PLOS ONE, the researchers derived the entropy, a measure of system disorder, for each team during each game. In six of the eight first-round series, winners had higher team entropy, and therefore more randomness, than losers. Though the sample size of teams in the NBA playoffs may be small, the data suggest a possible relationship between quick, unpredictable ball movement and success in games.
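As a rough illustration of the entropy idea, the sketch below computes the Shannon entropy of who receives the ball. This is a simplification and an assumption: the researchers' team entropy is derived from their full ball-movement model, whereas this only looks at the distribution of receivers, and the two pass lists are invented.

```python
import math
from collections import Counter

def pass_entropy(receivers):
    """Shannon entropy (in bits) of the distribution of pass receivers."""
    counts = Counter(receivers)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical receiver lists for ten passes each.
point_guard_centric = ["PG"] * 8 + ["SG", "C"]
evenly_distributed = ["PG", "SG", "SF", "PF", "C"] * 2

print(pass_entropy(point_guard_centric))
print(pass_entropy(evenly_distributed))
```

The evenly distributed list scores higher, reflecting that the next receiver is harder to predict when the ball is shared among more players.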

While fans direct cheers that fill sports arenas toward athletic giants such as LeBron James or Kobe Bryant, bright statisticians still sit in the shadows. But when these mathematical stars begin helping LeBron improve his game, it’s certain they’ll hear more and more of the applause.

Hungry for more? Read the full article on Wired.

Monday 14 January 2013

FirmnetOnline training for Consultants


13 NGOs applied for Maven7’s pro bono organizational development training, which gives them an opportunity to be advised by professionals on networking and structural matters. We would like to take the opportunity to thank all the candidates, and congratulate the final three: Greenpeace Hungary, Blue Point Drug Consultation Center and Ambulance, and Transparency International Hungary.

During this Firmnet Online training, participants received a full network analysis, from goal specification to result analysis. At the first meeting, representatives of the third sector and the trainers discussed the problems each of them is facing. By the end of the training, these consultants will be able to run a full analysis from beginning to end all by themselves.

For more about FirmnetOnline and training opportunities follow this link. Stay tuned for further details on the NGO program.


Zsolt Szegfalvi (Greenpeace) with the consultants of the training.

Thursday 3 January 2013

Social Media and the Power of Networks 2. – Key Opinion Leaders on Twitter


The increasing impact of social media gives modern marketing a lot to think about; Facebook, Twitter, Tumblr, Flickr, Pinterest, Google+ and hundreds of blogs are only the tip of the iceberg, and it seems impossible to be up-to-date on all the channels. Looking at them one by one seems illogical, since the key aspect of the generated content lies in the network effect, which enables the vast exchange of information. What remains to be done? This three-part series introduces Maven7’s newest research focusing on the network effect, with the aim of making life easier for online marketing, PR, and product management experts.

In contrast to the Facebook boom that began 2-3 years ago and reached 3 million users in Hungary last year, the Twitter community seems to be growing at a slower pace. Twitter was launched in 2006 in San Francisco and has around 30 thousand Hungarian visitors a day, similar to the blog hosting site Tumblr.

Why bother with them at all, you may ask? The majority of Twitter and Tumblr users come from an urban environment; most of them are high-status people living in Budapest. Microblogs spread information, especially negative information, very fast. Here is a comparison: a “traditional” online medium might be busy with a story for a whole week, whereas on Twitter, given that the right person spreads it, the same information is distributed within 2.5 hours! It is therefore of great importance to keep these outlets under control as much as possible. It is no coincidence that Hollywood celebrities like Charlie Sheen (with his 7.5 million followers) get paid around 50 thousand dollars per tweet. Our survey conducted during the Spanish election season showed that even an average person can have a substantial effect on voters. This leaves no doubt about the need to monitor the information that reaches these loyal, high-prestige consumers.

National key opinion leaders (famous journalists, bloggers, athletes) are active on multiple social media platforms, but their small follower bases point to the fact that the person with the most followers is not necessarily the most influential one when it comes to information distribution. We need to find out which tweeter is the most relevant and has the power to form opinions about our products. We can achieve this by applying data mining methods to Twitter data. The user’s position in the network is another key factor (i.e. how many followers the user has in common with our competing brand). Compared to Twitter, Facebook has open activity data, which means that we can easily access information regarding the user’s network of contacts.




There are multiple ways to build networks from the connections of Twitter users. First of all, we can regard the distributors (people related to the brand, or the brand’s official page) as the source of information and link individual users to them based on who retweeted the source’s message. Furthermore, the users themselves have followers and friends online, the latter representing a stronger status, and these can be interpreted as a network in themselves (for more, check our previous article on a follower- and friend-based network). The picture shows a network of retweeted messages related to an FMCG product distributor and its competitors.
The second picture shows which information sources have the most influence on our consumer base. The yellow boxes are the key opinion leaders (KOLs), who can reach the major part of the community in only three steps. They hold a central position in the network because they have the biggest follower and friend base.
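"Reaching the community in three steps" can be checked with a breadth-first search limited to three hops. The sketch below assumes a hypothetical mapping from each account to the accounts that follow it, so that a message travels from an account to its followers; the account names are invented and this is not the method used to produce the picture.

```python
from collections import deque

def reach_within(followers, source, max_steps=3):
    """Return the users a message from `source` can reach in at most `max_steps` hops."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        if distance[user] == max_steps:
            continue  # do not expand beyond the hop limit
        for follower in followers.get(user, []):
            if follower not in distance:
                distance[follower] = distance[user] + 1
                queue.append(follower)
    return set(distance) - {source}

# Hypothetical follower lists: account -> accounts that follow it.
followers = {
    "kol": ["a", "b"],
    "a": ["c"],
    "c": ["d"],
    "d": ["e"],
}
print(sorted(reach_within(followers, "kol")))
```

Dividing the size of the returned set by the size of the community gives the share of users a candidate KOL can reach in three steps, which can then be compared across accounts.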
By analysing Twitter data we can not only locate the key opinion leaders and characters of a brand; with the help of location information we can also support product placement research. A good example of using location data is our previous article on the optimal localization of ATMs.

To be continued.