Why do some online communities succeed while others fail? One of today’s biggest challenges is figuring out online user engagement. Average visitor numbers matter less than how many visitors become regulars or even addicts who can’t live without your service. Understanding and improving user engagement is key: not only does it enable viral growth of a community, but it also helps to reap valuable monetization benefits from more effective online advertising.
This blog post is based on a paper I wrote in Sarah Szalavitz’s class on Social Design at the MIT Media Lab. I start off by taking a quick look at the fundamentals of communities, decision making, and game mechanics. In order to compare engagement levels on community websites, I will create a metric based on Quantcast traffic frequency data. I will then look at the engagement levels on 22 different online communities and derive some key differences. Furthermore, I will conduct an in-depth analysis of two Q&A sites, Stack Overflow and Answers.com, using a simple game mechanics framework.
If you are interested in reading the whole paper, feel free to ping me.
How do Communities Work?
In order to better understand how user engagement can effectively work in online communities, one first has to understand the basic mechanics of communities. Christopher Allen wrote a comprehensive blog post in 2009 illustrating the various types of communities in which we co-exist:
- Support Circle (3 to 5 people): consists of people one is closest to and would “seek advice, support, or help from in times of severe emotional or financial stress”
- Sympathy Circle (7 to 20 people): this circle is “often made up of kin, but usually includes some peers as well”
- Trust Circle (40 to 200 people): there is “some type of intimate connection” to people within this circle
- Emotional Circle (around 300 people): often also referred to as “weak ties”
- Familiar Strangers (>1,000 people): people we have very infrequent interactions or random encounters with
I will come back to the concepts of weak and strong ties and social network size in the section on how to influence people. An interesting experiment was published in the paper “Social Network Size in Humans”, in which Hill and Dunbar tried to estimate the average social network size based on the exchange of Christmas cards. It turned out that the maximum network size averaged 153.5 individuals. The authors identified the following determining factors:
- Passive factors (for example physical distance, work colleagues, overseas)
- Active factors (emotional closeness, genetic relatedness)
The study suggests the existence of cognitive constraints on network size.
Individual’s Involvement in Groups
According to Wikipedia’s article on the 1% rule, “In online communities, […] participation inequality power rule is very apt”. The rule is closely tied to the 90-9-1 principle: “1% of people create content, 9% edit or modify that content, and 90% view the content without contributing.”
Exhibit 1: Community Participation Rule of Thumb
Michael Wu has done empirical research (10 years of user contribution data, 200+ online communities) showing that the rule is a good rule of thumb, but that the exact split depends on the setup and type of community.
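To make the 90-9-1 rule of thumb concrete, here is a minimal Python sketch that computes the share of creators, editors and viewers from per-user activity counts. The function name, the data layout and the sample community are my own assumptions for illustration; they are not taken from Michael Wu's data.

```python
from collections import Counter


def participation_shares(users):
    """Return the percentage of creators, editors and lurkers.

    `users` maps a user name to a (posts_created, edits_made) tuple.
    Assumption for this sketch: anyone who created content is a creator,
    anyone else who edited content is an editor, the rest are lurkers.
    """
    if not users:
        raise ValueError("need at least one user")
    roles = Counter()
    for created, edited in users.values():
        if created > 0:
            roles["creators"] += 1
        elif edited > 0:
            roles["editors"] += 1
        else:
            roles["lurkers"] += 1
    total = len(users)
    return {role: 100.0 * roles[role] / total
            for role in ("creators", "editors", "lurkers")}


# Hypothetical community of 200 members that matches the rule exactly.
sample = {f"creator{i}": (3, 1) for i in range(2)}
sample.update({f"editor{i}": (0, 4) for i in range(18)})
sample.update({f"lurker{i}": (0, 0) for i in range(180)})

shares = participation_shares(sample)
print(shares)
```

Running the same function on real contribution logs would show how far a given community deviates from the 1/9/90 split.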
How to Influence People?
There has been a tremendous amount of research into how people make decisions and how these decisions can be influenced. I suggest the following books: How We Decide (Jonah Lehrer, Houghton Mifflin, 2009); Why We Buy (Paco Underhill, Simon & Schuster, 2000); Your Brain is (Almost) Perfect (Read Montague, Plume, 2007); and Predictably Irrational (Dan Ariely, Harper Perennial, 2010).
What is most relevant to this paper is the question of how people can be influenced to engage in an online community, with a particular focus on the effect of the social design component.
With respect to this discussion, there seem to be two schools of thought: (1) Influencing through powerful hubs, frequently called Influencers or Mavens, and (2) Influencing through the masses, in a decentralized way.
Exhibit 3: Influencers versus Masses
Source: Fast Company Article, Is the Tipping Point Toast
Clive Thompson recently asked in Fast Company: Is the Tipping Point Toast? He challenges the theory supported by people like Malcolm Gladwell and Ed Keller, namely that key influencers are critical to spreading news and influencing communities. Central to their argument is a study by Milgram (famously known as the small world experiment) which, apart from supporting the six degrees of separation theory, also shows that a few key people (three friends, in the experiment) were responsible for the successful delivery of a letter in the majority of cases, leading Gladwell & Co. to conclude that these super-connectors are critical to the system.
Duncan Watts, on the other hand, repeated the small world experiment on a larger scale in an online-based experiment and found that super-connectors are not that important – only around 5% of traffic went through them. He concludes that not people but ideas matter: “If society is ready to embrace a trend, almost anyone can start one–and if it isn’t, then almost no one can”.
In Thompson’s article, Watts also compares trends to forest fires: “There are thousands a year, but only a few become roaring monsters. That’s because in those rare situations, the landscape was ripe: sparse rain, dry woods, badly equipped fire departments. If these conditions exist, any old match will do.”
More support comes from Paul Adams, the former social research lead in the UX team at Google who now works at Facebook, with a great presentation titled The Real Life Social Network. Adams says that “the role of influentials is over-estimated” and “whether someone can be influenced is as important as the strength of the influencer”. Whether someone can be influenced depends on “what their social network looks like” and “what they have experienced before”.
How do these influencers compare to the super-users identified in the previous section? It is not entirely clear whether these two terms are interchangeable. While super-users are characterized by their high level of activity, influencers can be identified by analyzing the activity within the social graph, i.e., who is talking to whom in a given community. Influencers are those who shape opinions and influence others through communication. Therefore, they may or may not be super-users at the same time.
Understanding Game Mechanics
A lot of research has been conducted into social games, gamification and how businesses can use it. A great source for an initial overview is provided on Gamification.org, a wiki for game-related knowledge. The site lists the following game features as the key components when building a game:
- Activity Feed (shows players what is going on)
- Avatars (unique representations of players)
- Easter Eggs (the fun stuff: intentional hidden messages and in-jokes)
- Instances (unique experiences outside the normal experience)
- Leaderboards (track performance and compare it to others; they can be dangerous, so don’t scare off newbies; they can tell stories, e.g. social leaderboards showing “you and your friends”, and can have missions built in)
- Notifiers (feedback to users about progress and performance in the game)
- User Profile (all data about one’s activity)
Amy Jo Kim provides a different way to think about gamification: in her presentation she points out four particular elements that build on each other:
- Exchanges Customization
In a different talk, she makes a very important distinction with respect to points. They can take three different forms and functions:
- Experience Points (earned directly for user’s actions)
- Skill Points (interacting with the system)
- Influence Points (assigned by other people)
Furthermore, to go into a little more detail, Gamification.org lists 24 different game mechanics, ranging from A like Achievement to V like Virality. The authors of the site have also identified 3 attributes:
- Type (Progression, Feedback, Behavioral)
- Boosts / Benefits (Engagement, Loyalty, Time spent, Influence, Fun, SEO, UGC, Virality)
- Personality Types (Explorers, Achievers, Socializers, Killers)
Obviously, the title of this paper has engagement in it: engagement is by far the most important benefit and brings with it most of the other benefits, i.e., if engagement is high, then loyalty, virality and fun will typically be high as well.
Drawing on the similarities between the frameworks proposed in recent literature and on the factors that I believe are most relevant to user engagement in online communities, I have developed the following simplified framework for gamification:
Exhibit 4: Gamification Framework
Later on in this paper, when looking at the case studies of Stack Overflow and Answers.com, I will use this framework to point out the key differences between the two sites.
As Rick Webb recently pointed out in a talk in the Social Design lab at the MIT Media Lab, “engagement is very hard to measure”, and there is currently no well-defined industry standard.
Here are some examples of the metrics provided by Google Analytics, the leader in tracking website traffic and providing analytics:
- Average page views per visitor
- Time spent on site
- Total time spent per user
- Recency and Frequency (including frequency of visits, time since last visit)
- Length and depth of visit
- Bounce Rate
These metrics can be helpful; however, in today’s world of Flash and Ajax, where content often changes without a new page being loaded, page-based metrics may carry limited information.
In particular areas, such as online publishing, a few other internal and external social metrics can help measure engagement, as suggested by John Byrne, BusinessWeek’s online editor, in an interview with Eric Ulken:
- Internal Metrics
  - Number of Comments
  - Return commenters
  - Number of times emailed
- External Metrics
  - Diggs / Delicious saves
  - Inbound links from blogs
EROC: Engagement Ratio of Online Communities
Not only are there no clear metrics for online user engagement, but a lot of the internal analytics data is also proprietary, which makes it difficult to compare online communities.
Quantcast, apart from providing website traffic rank data, also offers a breakdown of traffic into three user groups: addicts, regulars and passers-by. According to the website, addicts are the “hardcore segment of a site’s audience, who have 30 or more visits to that site in a month”, regulars are users that “frequent the site more than once per month but not as much as addicts”, and passers-by “have a single visit over the course of a month”.
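The Quantcast definitions quoted above translate directly into a simple classification by monthly visit count. The following Python sketch is only an illustration: the function names and the sample visit counts are hypothetical, and Quantcast's actual implementation is not public to me.

```python
def quantcast_segment(visits_in_month):
    """Classify one visitor using the thresholds quoted above."""
    if visits_in_month < 1:
        raise ValueError("a visitor has at least one visit")
    if visits_in_month >= 30:
        return "addict"       # 30 or more visits in a month
    if visits_in_month > 1:
        return "regular"      # more than once, but fewer than 30 visits
    return "passer-by"        # exactly one visit


def segment_shares(monthly_visits):
    """Return {segment: (% of audience, % of visits)} for a list of
    per-visitor monthly visit counts."""
    audience = {"addict": 0, "regular": 0, "passer-by": 0}
    visits = {"addict": 0, "regular": 0, "passer-by": 0}
    for count in monthly_visits:
        segment = quantcast_segment(count)
        audience[segment] += 1
        visits[segment] += count
    total_users = len(monthly_visits)
    total_visits = sum(monthly_visits)
    return {
        segment: (100.0 * audience[segment] / total_users,
                  100.0 * visits[segment] / total_visits)
        for segment in audience
    }


# Hypothetical month: 2 addicts, 3 regulars and 10 passers-by.
sample_visits = [30, 30] + [10, 10, 10] + [1] * 10
shares = segment_shares(sample_visits)
for segment, (audience_pct, visits_pct) in shares.items():
    print(f"{segment}: {audience_pct:.1f}% of audience, {visits_pct:.1f}% of visits")
```

In this made-up sample, the two addicts are a small part of the audience but account for most of the visits, which is exactly the kind of imbalance the audience and visit percentages are meant to expose.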
In order to make the data easily comparable for different websites, I am constructing the following ratio:
EROC = [ 2 * Aa * Av + Ra * Rv ] / [ Pa * Pv ]
Aa = Addicts % of total audience
Av = Addicts % of total visits
Ra = Regulars % of total audience
Rv = Regulars % of total visits
Pa = Passers-by % of total audience
Pv = Passers-by % of total visits
While somewhat arbitrary, EROC puts the following criteria into one simple number:
- Addicts are more valuable than Regulars (using a factor of 2, which can be adjusted based on the monetization of addicts versus regulars and, more generally, on business model and industry)
- Addicts and Regulars are more valuable than Passers-by
- Passers-by should account not only for a small percentage of the audience but also for a small percentage of visits
The higher the EROC, the better the user engagement of a particular community website. However, best-practice EROCs can differ by industry, as some communities by nature have a higher interaction frequency than others.
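As a quick illustration of the EROC ratio defined above, here is a minimal Python sketch. The input percentages are hypothetical and chosen only to show the arithmetic; they do not describe any of the sites analyzed in this post. Because both the numerator and the denominator are products of two percentages, the result is the same whether the inputs are given as percentages (e.g. 20) or as fractions (e.g. 0.20).

```python
def eroc(aa, av, ra, rv, pa, pv, addict_weight=2.0):
    """Engagement Ratio of Online Communities.

    aa, av: addicts' share of audience and of visits
    ra, rv: regulars' share of audience and of visits
    pa, pv: passers-by share of audience and of visits
    addict_weight: the factor 2 from the formula; adjustable.
    """
    denominator = pa * pv
    if denominator == 0:
        raise ValueError("passer-by shares must be non-zero")
    return (addict_weight * aa * av + ra * rv) / denominator


# Hypothetical site: 1% addicts with 20% of visits, 29% regulars with
# 50% of visits, 70% passers-by with 30% of visits.
example = eroc(aa=1, av=20, ra=29, rv=50, pa=70, pv=30)
print(round(example, 3))  # (2*1*20 + 29*50) / (70*30) = 1490 / 2100

# The same site expressed as fractions gives the same ratio.
example_fractions = eroc(0.01, 0.20, 0.29, 0.50, 0.70, 0.30)
```

Exposing the addict weight as a parameter reflects the remark above that the factor of 2 can be adjusted to the business model.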
Data and Analysis
While the data is not available for every website, I was able to collect it for 22 community websites, grouping companies in the following segments:
- Social Networks I (defined by demographic focus)
- Social Networks II (defined by interest focus)
- Online Dating
- Online Video Content
- Online Q&A
- Stack Overflow
- Social Content Sharing
I have also included LinkedIn and Pandora as comparables.
The overall average EROC is roughly 1. A few sites stand out with very high EROCs: okcupid.com (2.97), Hulu (2.78) and Tumblr (2.37). On the low end of the spectrum are metacafe.com (0.14) and answers.com (0.232).
A key takeaway, as expected, is that EROC differs considerably between categories. For example, online dating comes in at 2.41, while social networks II (interest focus) comes in at 0.67.
While the sample size is certainly small, this could imply that online dating communities, owing to the necessity of creating profiles as well as people’s strong desire for interaction, have a much higher percentage of highly active users. Within the online video content group, one can also see a difference, which could be due to Hulu doing an amazing job of engaging users or just to the fact that people watch television every day but might only watch a movie once a month. Goodreads and rottentomatoes.com also have lower EROCs, which again might be because people consume books and movies at a much lower frequency.
In general, there might be different issues to consider when analyzing this type of data, such as user access via third party apps (e.g. in the case of Twitter, where a high percentage of users access the service via a third party app such as TweetDeck) or mobile app usage, which is not reflected in the data either.
Case study: Comparison of popular Q&A Sites
I decided to compare Stack Overflow and Answers.com as they are both in the same category of online Q&A, yet are very different from a social design perspective and also have very different engagement ratios:
Exhibit 5: Comparison of Stack Overflow and Answers.com
Stack Overflow: EROC = 1.288
Answers.com: EROC = 0.232
In the following sections, I will compare both online communities according to the gamification framework developed earlier.
I am a new user of both sites, so I hope I can bring an objective angle to the analysis, evaluating the sites from a first-time user’s perspective (an interesting perspective when trying to find out whether a site is able to easily engage new users).
I will briefly analyze Stack Overflow along the key dimensions of my framework: User Identity, Actions, Social, Discovery, Feedback and Points.
On user profiles, there is a box for leaving a short comment about oneself, but most people share very little information. The key information fields (name, member for, seen, website, location and age) are also few compared to most other social networks. There is a very strong focus on stats, in particular reputation: the biggest number displayed on the page is the reputation score directly below the person’s avatar.
Also, the user profile not only features the raw score, but also indicates that the user has placed in the “top 10% this week” – providing a feedback / ranking measure. What’s interesting here is the small time frame; it allows even new users to quickly rise up the ranks and ensures high frequency interaction with the website.
Further down on the profile page, the site lists a user’s questions or questions he/she has been involved with. For each question there are several stats: number of votes, number of answers and number of views.
The only way to build reputation is to ask good questions and to give good answers to other people’s questions. A variety of actions are tied to a minimum reputation score, i.e., new users have a limited set of actions and have to rise to a certain level to unlock others (a commonly used game mechanic). A lot of these actions fall into the social category and will be discussed in the next section.
Here are some of the key actions:
- Create posts (no reputation required)
- Vote up
- Flag posts
- Comment everywhere
- Vote down
- Retag questions
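The mechanic of unlocking actions at a minimum reputation score can be sketched as a simple threshold lookup. In the Python sketch below, only two numbers come from this post: creating posts requires no reputation, and talking in chat requires 20 points. All other thresholds are illustrative placeholders that I made up to show the mechanic; they are not Stack Overflow's actual values.

```python
# Minimum reputation needed per action.
PRIVILEGE_THRESHOLDS = {
    "create posts": 0,         # from the text: no reputation required
    "talk in chat": 20,        # from the text: 20 reputation points
    "vote up": 10,             # placeholder, NOT the real threshold
    "flag posts": 10,          # placeholder, NOT the real threshold
    "comment everywhere": 50,  # placeholder, NOT the real threshold
    "vote down": 100,          # placeholder, NOT the real threshold
    "retag questions": 500,    # placeholder, NOT the real threshold
}


def unlocked_privileges(reputation):
    """Return the actions available at the given reputation,
    ordered by threshold and then alphabetically."""
    available = [name for name, needed in PRIVILEGE_THRESHOLDS.items()
                 if reputation >= needed]
    return sorted(available, key=lambda name: (PRIVILEGE_THRESHOLDS[name], name))


def next_privilege(reputation):
    """Return (action, points still missing) for the closest locked action,
    or None if everything is unlocked."""
    locked = [(needed - reputation, name)
              for name, needed in PRIVILEGE_THRESHOLDS.items()
              if reputation < needed]
    if not locked:
        return None
    missing, name = min(locked)
    return name, missing


print(unlocked_privileges(20))
print(next_privilege(20))
```

Showing the user the closest locked action and the points still missing is one way such a system can double as a feedback mechanism.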
As pointed out before, a lot of these actions are of a social nature, in line with Stack Overflow’s vision: “you earn reputation from your peers, you [gain] the community’s trust – and will be granted additional privileges on Stack Overflow”. Similar to Wikipedia, Stack Overflow is a collaboration site: users can edit other users’ answers, and all changes are tracked. Furthermore, there’s a chat platform where users can meet and group around particular topics (see Appendix).
One thing that’s interesting to note is that there is no integration with Facebook and Twitter for spreading messages or actions, something that is very popular with other social sites such as Foursquare. I believe it is a conscious choice by Stack Overflow, due to the particular use of the site by its users.
There aren’t many surprise or discovery elements, as far as I can tell (note: as a Stack Overflow novice, I might not have reached the level where these show up). One thing that could fall into this category, though, is bounties. Users can offer bounties for solving questions. They are effectively a fixed number of reputation points which come out of the offering user’s own account.
The feedback mechanisms are very clear and direct: other people vote questions and answers up or down. In addition, the user who has asked a particular question can, after looking at all the answers, decide which one is most helpful and mark it as the “accepted answer”. Furthermore, other users can suggest edits. Both of these actions are also rewarded with reputation points.
When describing game mechanics earlier, I pointed out three types of points systems: experience points, skill points, and influence points. Stack Overflow is a perfect example of mastering the points system, tapping into all categories, yet putting a very strong focus on influence points. As pointed out earlier, the reputation score impacts what people can do on the site (see screenshot on the right); for example, the 5% of “talk in chat” translates into 20 reputation points.
Stack Overflow has a relatively simple leaderboard and a large collection of badges (>60 different ones ranging from A like Altruist to Y like Yearling) categorized into three main categories:
- Bronze badge: “awarded for basic use” e.g. Commentator (someone who left 10 comments)
- Silver badge: “awarded for long term goals” e.g. Epic (earned at least 200 reputation on 50 days)
- Gold badge: “are rare”, “actively work towards these” e.g. Famous Question (someone who asked a question with 10,000 views)
For each badge, the number of people who currently hold it is shown. For example, there are currently more than 64,000 Commentator, more than 150 Epic and more than 6,000 Famous Question badges issued. It is interesting to note that while gold badges are supposed to be the hardest to earn, some of them have been awarded more often than certain silver badges.
As done for Stack Overflow, I will again briefly analyze Answers.com along the key dimensions of my framework: User Identity, Actions, Social, Discovery, Feedback and Points. As before, all the relevant screenshots can be found in the appendix.
A typical user profile has the username, gender, age and location in addition to some key stats:
- Trust points
- Contributions stats, including # of Answers, # of Edits, Organization, # of Questions, Community
I will talk in more detail about trust points and badges in the points section.
In general, I noted that most profiles have very little information. The navigation is also poor; it took me a long time to find information about how to earn points, what the different stats mean and where the leaderboard is. Another difference from Stack Overflow is that most users, even the top users, have no profile pictures, giving the site a less personal touch.
In general, the key actions are asking and answering questions. Answers.com overlays 14 different member roles, ranging from Bug Catcher to Wiki Influential Teen. For example, there is a Vandal Patrol with different tasks and positions; the group has a program coordinator by the user name of An8thing and one has to send an email to JoinVandalPatrol@WikiAnswers.com to join. In general, the Community Roles & Programs pages are overloaded and very difficult to make sense of as a first-time user.
Answers.com writes that “half the fun of participating in a community is the interaction with your fellow members”. Yet, apart from awarding Trust Points (see below), the options available for interaction are the message boards, discussion pages and the community forum. The various roles that are available are geared towards high-frequency users, creating a divide between regulars and addicts that seems hard to bridge.
In addition, Answers.com offers Facebook integration to “see your friends’ activity”.
I wasn’t able to identify any particular discovery elements. Due to the large number of activities and roles available, users might feel overwhelmed and not very receptive to discovery elements anyway.
Users can receive Trust Points (see below) from other users when they click on “recommend contributor” to reward that user for a good contribution. Compared to Stack Overflow’s sophisticated reputation system, Answers.com’s system seems much easier to game and less effective, for example because Trust Points can currently not be subtracted.
Trust Points “enable you to vote for a WikiAnswers member whose contributions you think are worthwhile and legitimate”.
In addition, users collect points for answers, edits, questions and other interactions.
The leaderboard is much more complex than Stack Overflow’s. While it is intuitive to have an overall leaderboard, it is not really clear why leaderboards are necessary and useful for the particular categories (questions, answers, community).
Lessons Learnt from a Social Design Perspective
From a social design perspective, one of the big trends is the “shift from storytelling to storysharing”. Sarah Szalavitz has been developing a strategic framework at 7Robot around the social design of systems using this shift “through choice optimization and behavioral economics”.
The above case studies of Stack Overflow and Answers.com have illustrated the huge impact of these techniques on user experience, which translates directly into user engagement. For example, Answers.com has very poor choice optimization (it is still unclear to me which group I should join and which badge would be valuable to me), while Stack Overflow has done an excellent job in choice optimization (simple leaderboard, easy-to-understand system of badges, intelligent time horizons).
Getting game mechanics right matters! The case study of Stack Overflow versus Answers.com has shown how Social Design elements can help to successfully engage online communities. The proposed framework of breaking down the game into User Identity, Actions, Social, Discovery, Feedback and Points has proven helpful in understanding the key differences.
With respect to measuring user engagement effectively, the proposed EROC metric makes it easier to compare traffic frequency data across multiple communities, but it falls short of taking into account the various subtleties of different verticals and user preferences. More work has to be done in this field to provide better metrics to companies and analysts.