Community

[Analytic Block] Peer Comparison: Compare Against Rest of Population

About This Block

Note: you can find an alternative pattern for multi-level peer comparison and “share of wallet” comparison here

Peer comparison analysis shows you how one group, categorized by an individual or set of determinants, is performing against another group. You can take this one step further, and compare that first group against of the rest of the population of users/items. This helps companies get a comprehensive view of an individual item’s performance based on any number of attributes that you collect about that product, user, etc. Peer comparison has helped companies answer many questions, such as:

  • How is Sales Rep A performing compared to others in his/her geography? Compared to the national average?
  • How is the average weekly growth of Stock A trending compared to the rest of the portfolio?
  • What is the customer conversion rate for visitors referred from Google and Facebook compared to all other channels?
  • What is the most popular or profitable category of item sold by Brand A, compared to every other brand?

In this example, we will demonstrate how you can implement peer comparison, or Rest of the Population, analysis in Looker using a sample eCommerce dataset. In the process, we will uncover two key takeaways:

  1. How average profit dollars vary between item categories across brands
  2. Which item category is the most popular in one brand compared, to all other brands

Ideal Data Types

This Block is best suited for datasets containing individual or groups of populations such as brands, sales representatives, regions, etc. This data may come from Salesforce, web analytics tools (Google Analytics, Segment, etc.), and other transactional data tied to groups that can be singled out for comparison against everybody else.

In terms of KPIs, averages or percent-of-total calculations work best, as these KPIs enable you to look at apples-to-apples comparisons across groups. On the flip side, totals and cumulative calculations will skew the results towards larger groups, especially for the Rest of the Population category. We will explore both results to show you what kind of insights you should be looking for in peer comparisons.

Expected Output

Peer comparison analysis allows you to see how one individual / group is performing compared to the rest of the population in one or more KPIs, as demonstrated below:

[Explore Data in Full Screen] (https://blocks.looker.com/embed/explore/peer_comparison/peers?show=data&vis={}&query=7Yd8fKw&filter_config={"peers.brand_select":[{"type":"%3D","values":[{"constant":"Calvin%20Klein"},{}],"id":0}]}&f[peers.brand_select]=Calvin%20Klein)

This sample image compares the number of Calvin Klein products with the rest of the products offered by this retailer. With this info in hand, a brand manager may decide to focus on marketing campaigns to increase sales for Calvin Klein by focusing on the items with the highest product count.

Try it yourself!

How It’s Done

This is a very simple and easy-to-implement pattern utilizing both filter-only fields and Liquid parameters to specify individual groups to compare with the rest of the population. Part of the beauty of this approach of using Liquid in Looker is that it allows end-users to make the item selection straight from the user interface, which is then safely inserted as SQL into our optimizer.

We will first start with creating a filter-only field in LookML. Filter-only fields are created using the following syntax:

 filter: insert filter field name here {
  suggest_dimension: (insert dimension name where potential values come from)
}

The suggest_dimension parameter enables you to use an existing dimension from which to select your desired individual / group. While the use of suggest_dimension is optional, it is highly recommended, especially if you have dimensions referring to brand names, sales representative names, region / state names, etc., to ensure users select the correct values rather than mistyping them using free form text.

For our example, we’ll want to define a filter-only field to select specific brand names as described below:

 filter: brand_select {
  suggest_dimension: brand.name
  }

The second field we’ll define is a comparitor dimension. This is where Liquid parameters come into play:

 dimension: brand_comparitor {
    type: string
  sql: 
    CASE
      WHEN {% condition brand_select %} ${brand_name} {% endcondition %}
        THEN ${brand_name}
      ELSE 'Rest of Population'
    END ;;
  }

What’s happening here is the following:

  1. The brand_comparitor dimension reads the values specified in the brand_select filter
  2. For each brand_name entry in the data table, if the brand_name value matches the value in brand_select, that same brand_name is returned.
  3. If the brand_name value does not match what’s in brand_select, it gets grouped into “Rest of Population.”

Now you can select any measure and slice it by this brand_comparitor to see the results grouped by each value captured in brand_select, as well as all other values captured in “Rest of Population”!

With this Looker Analytics Block under your belt, you’re on your way to creating more peer comparison patterns. Some common combinations of filter-only fields and comparitor dimensions include:

  • Sales reps
  • Stock and other securities
  • Product categories
  • SKUs in inventory management
  • etc.

Further Analysis

Compare to Entire Population

This design can also be used to compare groups to the entire population just by tweaking the LookML logic. Let’s take a look at the following LookML:

**Model File**
explore: products {
  hidden: true
  
   join: num {
    type: cross
    relationship: one_to_many
    }

}

**View File**
view: num {
  derived_table : {
    sql: SELECT 1 as n UNION SELECT 2 ;;
}
  dimension: n {
    hidden: true
}

Notice the cross join to num containing the values 1 and 2; we’ve just created a fanout in order to obtain two rows per entry in products so we can aggregate data for selected groups using num.n = 2. Though other groups will each have two rows, the one_to_many relationship lets Looker know to be aware of this fanout in order to avoid double-counting, and correctly calculate symmetric aggregates.

Next, we’ll edit the brand_comparitor dimension to reflect the change in business logic:

dimension: brand_comparitor {
type: string
    sql: 
      CASE
        WHEN {% condition brand_select %} ${brand_name} {% endcondition %} AND ${num.n} = 1 
          THEN ${brand_name}
        ELSE 'Total Of Population'
      END ;;
}

This CASE statement runs through the following steps:

  1. For each row, look for brand_name that matches brand_select, and where num.n = 1
  2. If brand_name indeed matches brand_select, that entry gets grouped by brand_name
  3. If the brand_name doesn’t match brand_select or num.n = 2, that entry gets thrown into the Total of Population category

Due to the one_to_many relationship, Looker will use COUNT DISTINCT instead of COUNT to accurately aggregate whatever measure you’re using.

The following is an example of what the results look like:

3 Likes