Previously, I wrote on the basics of utilizing the VLOOKUP function in Microsoft Excel. This session will expand upon those basics in more of a “real-world” scenario. In this example, a timekeeping table has been extracted to show Labor Date, Employee ID, and Hours Worked. Using the VLOOKUP function, we’ll add a labor premium based on the start time of each employee, as well as the total pay, job title, and employee name. Please note that this discussion assumes a certain level of comfort with the IF function. If you’re currently hesitant with using this function, leave a comment on this blog or stay tuned or a future session where the IF statements within Microsoft Excel, Microsoft Access/SQL (IIF), and ACL are reviewed.
VLOOKUP Example
Reference File: http://www.box.net/shared/jbn8o4xi75
Worksheets
- Summary Table – Pivot table showing total pay by employee-job title.
- vLookup Example – Primary table showing the [Example Data] and the VLOOKUP data
- Employee Index – Index table of employees by employee ID with the job code of each individual
- Job Index – Index table of job titles by job code
- Start-Time Premiums – Index table of wage premiums based on employee start time
Wage Premium (Column D in [vLookup Example])
This example, proposed by a colleague, shows a scenario where an analyst may wish to set the ABSOLUTE LOOKUP FLAG to TRUE. A review of [Start-Time Premiums] shows that employees that start between 20:00 and 06:00 receive 120% of their base pay, employees starting between 06:00 and 12:00 or 18:00 and 20:00 receive 100% of their base pay, and employees starting between 12:00 and 18:00 receive only 98% of their base pay. In a scenario where the ABSOLUTE LOOKUP FLAG is set to FALSE, an exact match would be required, which would lead to a significantly larger index table. In this example, however, we are able to utilize a simple index table by setting the ABSOLUTE LOOKUP FLAG to TRUE.
Cell D3: =VLOOKUP(TIME(HOUR(A3),MINUTE(A3),0),’Start-Time Premiums’!$A$3:$B$7,2,TRUE)
The time component of the Labor Date, a Date/Time field, is extracted and looked up within the [Start-Time Premiums] index table. For Row 3, this is 08:25, which does not exist within the index. However, since the flag has been set to TRUE, the VLOOKUP function returns 100%, the reference for start times between 06:00 and 12:00. For Row 8, 17:55 is the start time, for which the VLOOKUP function returns 98%, the reference for start times between 12:00 and 18:00. So, when dealing with Date/Time indexes, where absolute matches may lead to excessively-large index files, setting the ABSOLUTE LOOKUP FLAG to TRUE may be the optimal solution.
Pay (Column E in [vLookup Example])
>Cell E3: =IF(ISERROR(VLOOKUP(B3,’Employee Index’!$A$3:$D$28,3,FALSE)),”_Unknown”,ROUND(D3*IF(C3<=8,C3,IF(C3<=12,8+(1.5*(C3-8)),8+(1.5*4)+(2*(C3-12))))*VLOOKUP(B3,’Employee Index’!$A$3:$D$28,3,FALSE),2))
There’s a lot going on in the cell, so let’s first look at what we’re trying to do. We know when the employee started work and how many hours they worked (in this case 08:25 and 8.6 hours). So, the VLOOKUP needs to be utilized to pull the hourly wage for the individual, but an added level of analysis must be performed to determine straight-time (1x), overtime (1.5x), and double-time (2x) hours; in this example, the employee worked his/her eight hours of straight-time plus 0.6 hours of overtime (we’re ignoring weekends and holidays in this example). The equation must also pull the wage premium (already calculated in column D). Finally, an error handler is added to return “_Unknown” if the employee ID does not exist within [Employee Index]; conditional formatting has already been added to colour the text red if the equation returns “_Unknown”. The equation will now be broken down into its components:
<<A>> =IF(ISERROR(<<B>>),”_Unknown”,<<C>>)
<<B>> VLOOKUP(B3,’Employee Index’!$A$3:$D$28,4,FALSE)
<<C>> ROUND(D3*<<D>>*<<B>>,2)
<<D>> IF(C3<=8,C3,<<E>>)
<<E>> IF(C3<=12,8+(1.5*(C3-8)),8+(1.5*4)+(2*(C3-12)))
<<A>> utilizes the ISERROR() function, which returns TRUE if there’s a problem or false if the equation executes correctly. In this case, “_Unknown” will be returned if <<B>> returns an error, while <<C>> will be returned if <<B> does not return an error (since the ABSOLUTE LOOKUP FLAG has been set to FALSE in <<B>>, an error will occur if the value being “looked up” does not exist in the dataset).
<<B>> This is the actual VLOOKUP function, which is checking the value in B3 (the Employee ID) and pulling the fourth column from ‘Employee Index’!$A$3:$D$28 (Hourly Wage) if an exact Employee ID match is found (notice that Employee ID is a text field, not a numeric field).
<<C>> This simply multiplies the wage premium (D3) by the converted number of straight-time hours worked (<<D>>) and the hourly wage returned by the VLOOKUP function (<<B>>) and rounds to the nearest penny. Simply put, this is calculating the exact pay due to the employee for the hours worked that day.
<<D>> This IF statement checks to see if the employee worked 8 hours or less (straight-time only) and, if so, returns the number of hours worked; if the employee worked more than 8 hours, <<E>> is returned.
<<E>> Slightly more complicated, this IF statement already knows that the employee worked more than 8 hours, but checks first to see if the employee worked 12 hours or less (8 straight-time hours and an overtime component). If this is true, <<E>> returns 8 hours (straight-time) plus 1.5 times the remaining hours (this converts the overtime hours to the straight-time equivalent). If the employee worked more than 12 hours (8 straight-time hours, 4 overtime hours, and the rest as double-time), then <<E>> returns 8 hours (straight-time) plus 1.5*4 (6 straight-time hours equal 4 overtime hours) plus 2 times the remaining hours (this converts the double-time hours to the straight-time equivalent).
Therefore, in Row 3, $271.98 is returned as the pay for the day. This is because Employee 2304 exists in ‘Employee Index’ and receives $30.56 per hour. Since the employee started between 06:00 and 12:00, the wage premium is simply 100% of pay. Finally, since the employee worked 8.6 hours, <<E>> converts this to 9.2 straight-time hours. That means that the entire equation ends up being 100% * 8.9 * $30.56 rounded to the second decimal place, which equals $271.98.
Alternatively, in Row 9, you can see that the formula returns “_Unknown”, because employee 7205 does not exist within [Employee Index].
For further growth with Microsoft Excel formulas, including the IF statement and VLOOKUP function, manually calculate some of the other values in Column E to see if you can verify the formula’s result.
Job Title (Column F in [vLookup Example])
Cell F3: =IF(ISERROR(VLOOKUP(B3,’Employee Index’!$A$3:$D$28,3,FALSE)),”_Unknown”,IF(ISERROR(VLOOKUP(VLOOKUP(B3,’Employee Index’!$A$3:$D$28,3,FALSE),’Job Index’!$A$3:$B$7,2,FALSE)),”_Unknown”,VLOOKUP(VLOOKUP(B3,’Employee Index’!$A$3:$D$28,3,FALSE),’Job Index’!$A$3:$B$7,2,FALSE)))
Within this column, the formula needs to lookup the employee ID in [Employee Index], find the individual’s job code, and look that up within the [Job Index] table. If either the Employee ID or Job Code is not found within the index tables, the formula needs to return “_Unknown”. To accomplish this, two IF statements need to run the ISERROR() check and a nested VLOOKUP formula to find the Job Code must be utilized within the VLOOKUP formula that returns the Job Title. So, the equation may be broken into the following components:
<<A>> =IF(ISERROR(<<B>>),”_Unknown”,<<C>>)
<<B>> VLOOKUP(B3,’Employee Index’!$A$3:$D$28,3,FALSE)
<<C>> IF(ISERROR(<<D>>),”_Unknown”,<<D>>)
<<D>> VLOOKUP(<<B>>,’Job Index’!$A$3:$B$7,2,FALSE)
<<A>> This is the initial function that checks to see if <<B>> (the Job Code lookup) returns an error (i.e. the employee ID does not exist within [Employee Index]). If an error is returned, the function returns “_Unknown”; if there is no error, <<C>> is executed.
<<B>> This is the VLOOKUP that returns the Job Code from [Employee Index] by looking for the Employee ID (B3) within ‘Employee Index’!$A$3:$D$28 and returning the third column (Job Code).
<<C>> This is the second IF statement, which checks to see if the selected job code exists within [Job Index]. If the Job Code does not exist, an error is returned, and the cell will display “_Unknown”; if the Job Code does exist, <<D>> is executed to pull the Job Title.
<<D>> This is the final VLOOKUP, which pulls the Job Code returned by <<B>> and looks for it within ‘Job Index’!$A$3:$B$7. Once found, the second column, Job Title, is returned.
Through these nested IF and VLOOKUP statements, an index embedded within a primary index table (in this case, Job Code within the [Employee Index]) may be pulled from a secondary index table ([Job Index]). Through the use of the two IF statements as error handlers, the function will return “_Unknown” if either the employee ID or job code do not exist; as stated in the previous example, the cells contain conditional formatting to highlight “_Unknown” as red in order to be flagged for further review.
Employee Name (Column G in [vLookup Example])
Cell G3: =IF(ISERROR(VLOOKUP(B3,’Employee Index’!$A$3:$B$28,2,FALSE)),”_Unknown”,VLOOKUP(B3,’Employee Index’!$A$3:$B$28,2,FALSE))
This is one of the more basic VLOOKUP functions contained within this example. Simply put, the Employee ID is being searched within the [Employee Index]. If it exists, the second column, Employee Name, is returned; if it does not exist, an error is returned, which the IF statement catches in order to display “_Unknown”.
Perils of ABSOLUTE LOOKUP FLAG to TRUE
Columns H through J have been added to again highlight the perils of setting the ABSOLUTE LOOKUP FLAG to TRUE. In this example, rows 9 and 33 contain employee IDs that are not listed within [Employee Index]. When the ABSOLUTE LOOKUP FLAG is set to FALSE, the error handler is returning “_Unknown”, which, through Conditional Formatting, has been set to be highlighted in red. Notice, however, within Columns H through J that, instead of flagging an employee that does not exist, an actual employee is being shown as receiving the pay. This could, at best, throw off the validity of the data analysis and, at worst, fail to uncover potential fraud or inaccurate source data. Therefore, always remember to properly set the ABSOLUTE LOOKUP FLAG when utilizing the VLOOKUP formula and, when in doubt, set it to FALSE.