Parsing user agent into device_type, manufacturer, browser

I’m currently using a fairly rudimentary case statement to parse useragent strings into ‘device_type’ (see below).

I was wondering if anyone has done something similar for device manufacturer and/or browser they’d be willing to share.

There are certainly lots of nuances & limitations to accurate useragent detection which will impact the accuracy of data derived from these strings, but it seems the best short-term bath to get a view into what devices and browsers customers are visiting us from.

If you have alternate approaches, I’m all ears as well!

Thanks!

CASE
    WHEN ${landing_page_url} LIKE 'file:///private/var/mobile/%' OR ((${useragent} LIKE '%iphone%' OR ${useragent} LIKE '%iPhone%' OR ${useragent} LIKE '%Windows mobile%' OR ${useragent} LIKE '%Windows phone%' OR ${useragent} LIKE '%Windows Phone%' OR ${useragent} LIKE '%Nexus 5%' OR ${useragent} LIKE '%GTI-9300%' OR ${useragent} LIKE '%Nokia%' OR ${useragent} LIKE '%SGH-M919V%' OR ${useragent} LIKE '%SCH-%' OR ${useragent} LIKE '%Mobile%' OR ${useragent} LIKE '%Opera mini%') AND (${useragent} NOT LIKE '%iPad%')) THEN 'mobile'
    WHEN ((${useragent} LIKE '%Windows%' OR ${useragent} LIKE '%WOW64%' OR ${useragent} LIKE '%Intel Mac OS%' OR ${useragent} LIKE '%Windows NT 6.1; Trident/7.0%' OR ${useragent} LIKE '%Media Center PC%') AND (${useragent} NOT LIKE '%iPad%')) THEN 'desktop'
    WHEN (${useragent} LIKE '%Tablet PC%' OR ${useragent} LIKE '%Touch%' OR ${useragent} LIKE '%MyPhone%' OR ${useragent} LIKE '%iPad%' OR ${useragent} LIKE '%ipad%' OR ${useragent} LIKE '%Tablet%') THEN 'tablet'
    WHEN (${useragent} LIKE '%Baiduspider%') ELSE 'unknown'
END
1 Like

I wrote some of this myself for our internal usage. Here is my code (warning, its gnarly). I’d add, I’m not an expert here, so this was based on a couple hours of research and there may be corner cases I’m missing. I’d also add that the code below is for Redshift, so may need mild tuning for other dialects.

  - dimension: user_agent
    sql: ${TABLE}.user_agent

  - dimension: browser
    sql: |
      CASE
        WHEN ${user_agent} LIKE '%Firefox/%' THEN 'Firefox'
        WHEN ${user_agent} LIKE '%Chrome/%' OR ${user_agent} LIKE '%CriOS%' THEN 'Chrome'
        WHEN ${user_agent} LIKE '%MSIE %' THEN 'IE'
        WHEN ${user_agent} LIKE '%MSIE+%' THEN 'IE'
        WHEN ${user_agent} LIKE '%Trident%' THEN 'IE'
        WHEN ${user_agent} LIKE '%iPhone%' THEN 'iPhone Safari'
        WHEN ${user_agent} LIKE '%iPad%' THEN 'iPad Safari'
        WHEN ${user_agent} LIKE '%Opera%' THEN 'Opera'
        WHEN ${user_agent} LIKE '%BlackBerry%' AND ${user_agent} LIKE '%Version/%' THEN 'BlackBerry WebKit'
        WHEN ${user_agent} LIKE '%BlackBerry%' THEN 'BlackBerry'
        WHEN ${user_agent} LIKE '%Android%' THEN 'Android'
        WHEN ${user_agent} LIKE '%Safari%' THEN 'Safari'
        WHEN ${user_agent} LIKE '%bot%' THEN 'Bot'
        WHEN ${user_agent} LIKE '%http://%' THEN 'Bot'
        WHEN ${user_agent} LIKE '%www.%' THEN 'Bot'
        WHEN ${user_agent} LIKE '%Wget%' THEN 'Bot'
        WHEN ${user_agent} LIKE '%curl%' THEN 'Bot'
        WHEN ${user_agent} LIKE '%urllib%' THEN 'Bot'
        ELSE 'Unknown'
      END

  - dimension: browser_version
    sql: |
      CASE
        WHEN ${browser} = 'Firefox'
          THEN SUBSTRING(${user_agent}, POSITION('Firefox' IN ${user_agent}) + 8, 100)
        WHEN ${browser} = 'Safari'
          THEN SUBSTRING(${user_agent}, POSITION('Safari' IN ${user_agent}) + 7, 100)
        WHEN ${browser} = 'Chrome'
          THEN LEFT(
                    SUBSTRING(${user_agent}
                              , POSITION('Chrome' IN ${user_agent}) + 7
                              , 100)
                    , POSITION(' ' IN SUBSTRING(${user_agent}
                                                , POSITION('Chrome' IN ${user_agent}) + 7
                                                , 100)
                              )
                    )
        WHEN ${user_agent} LIKE '%Trident%'
          THEN '11.0'
        WHEN ${browser} = 'IE'
          THEN SUBSTRING(${user_agent}, POSITION('MSIE' IN ${user_agent}) + 5, 4)
        WHEN ${browser} = 'iPhone Safari'
          THEN SUBSTRING(${user_agent}, POSITION('Safari' IN ${user_agent}) + 7, 100)
        WHEN ${browser} = 'iPad Safari'
          THEN SUBSTRING(${user_agent}, POSITION('Safari' IN ${user_agent}) + 7, 100)
        ELSE 'Unknown'
      END

  - dimension: platform_start
    type: number
    hidden: true
    sql: |
      POSITION('(' IN ${user_agent}) + 1
  
  - dimension: platform_raw
    hidden: true
    sql: |
      SUBSTRING(${user_agent}, ${platform_start}, 100)

  - dimension: platform_end
    type: number
    hidden: true
    sql: |
      CASE
        WHEN POSITION(';' IN ${platform_raw}) = 0
        THEN POSITION(')' IN ${platform_raw})
        ELSE POSITION(';' IN ${platform_raw})
      END

  - dimension: platform_end_2
    type: number
    hidden: true
    sql: |
      CASE WHEN ${platform_end} = 0 THEN 0 ELSE ${platform_end} - 1 END


  - dimension: platform
    sql: |
      SUBSTRING(${user_agent}, ${platform_start}, ${platform_end_2})
3 Likes

Colin - this is fantastic; super helpful! Many thanks!

@colin, edit to the Chrome logic:

WHEN ${user_agent} LIKE '%Chrome/%' OR ${user_agent} LIKE '%CriOS%' THEN 'Chrome'

per https://developer.chrome.com/multidevice/user-agent

1 Like

Edited, thanks.

Thanks Colin :smile: