Calculating KPIs with non-existent data for some employees

(Martin Guindon) #1


I’m trying to solve a problem and I’m not sure how to best approach it.

I have a few KPIs based on timesheet data such as a billable/non-billable ratio, which can be sliced down by departments, teams, individual people, etc. and by different time frames.

The challenge is this: executives are interested in knowing company-wide KPIs taking into consideration hours worked by employees that do NOT log their time in timesheets, by assuming a fixed number of hours by week. These employees can be identified at a high-level based on their department.

Having them starting to fill timesheets is not an option :sweat:

I assume it might be a mix of liquid logic, derived tables and creative joins between my employees list table and the timesheet data table. But I’m not sure where to start.

Any creative ideas?

(milli.koch) #2

Hi @mguindon!

If you’re looking to use a fixed number for hours worked for the employees without timestamps, solving this might be as simple as adding in a field in your database or creating a dimension with a case statement to insert the fixed number for those employees without timestamp data. I’d love to see some examples of your data tables though to get a better idea of what blanks need to be filled in and how these tables are connected.

Could you provide some clarification on how you’d like to define hours worked for the non-timestamp employees and possibly sketch up a basic example? From there I’d be happy to help brainstorm some ideas for achieving this!


(Martin Guindon) #3

Hi Milli,

I thought about the case statement in the dimension, however I wasn’t sure how to insert a fixed number of hours which takes into consideration the timeframe selected (week, month, quarter, etc.). Perhaps with liquid logic?

In the end we decided to create a cron script which will automatically submit a default timesheet for these employees, this way all systems downstream of the timesheet will work as expected, without any exception logic in any of them. It will be easier to handle exceptions this way (holidays, departures, etc.).

I’m still curious to learn how this challenge could have been approached in Looker, for the sake of self-education.

In terms of data structure, we have one table called “worklogs” which contains individual timesheet entries. A worklog contains a username, a date, time spent, and metadata related to the worklog entry (task id, etc.). An individual worklog can be as small as 15 min. or can be a full day of work depending on the person.

We also have another table containing employee information (team, department, role ,etc.) , which is a SCD type 2 table with an effective start date and end date for each row. We’re joining this table doing a left outer join with the following logic to find the employee details effective on the worklog date:

worklog.username = employees.username
AND >= employees.start_date
AND <= employees.end_date

Thanks :slight_smile:

(milli.koch) #4

Hi Martin,

I think you made the right move here in doing this on the database side. Adding new columns like this is definitely best to do in the ETL process as this will result in better performance than adding them in Looker.

For learning purposes though, I do think we could accomplish this in Looker!

Concerning your question about taking the selected timeframe into consideration, you’re on the right track with liquid. At a high level we could calculate the time spent separately for each set of users and union those into one column. For the salaried employees we can use the date_start and date_end liquid variables in something like this to assign an arbitrary value for time spent:

datediff({% date_start %}, {% date_end %}) * (CASE WHEN role = 'foo' THEN 8 WHEN role = 'bar' THEN 9 END AS total_timespent)

In this calculation, date_start and date_end will be informed by a front end filter and will be multiplied by the hardcoded values (in this case 8 and 9). We can assign that as a time spent column for employees and then union this to the existing worklog table. The resulting output would be a two column table with an id and a value for the total time spent based on the date filter.

You might also have to make some adjustments to only count weekdays, depending on how your work weeks are defined.

If you’re interested, we have a more in depth article explaining how to use the date_start and date_end liquid parameters with date filters here.

Hope that helps!


(Martin Guindon) #5

That makes a lot of sense. These parameters may get handy in the future.

Thanks Milli!