About This Blog

.net & SQL Samples, programming tips and tricks, performance tips, guidelines, and best practices

Tuesday 27 March 2012

SQL Server # Storing Hierarchical Data – Parent Child n’th level # TSQL

Introduction

Today, I would like to explain one way in which we can store the HIERARCHICAL data in SQL tables. A general table structure which people come up to store this kind of data is -

1

Where, EmployeeID id the UniqueID alloted to every new employee record inserted into the table and ManagerID is the EmployeeID of the immediate manager of the employee. Keeping in mind that Manager is also an employee.

Problem Statement

This table structure very well serves the purpose as long as we have 1-Level hierarchy. However, if the hierarchy is of n'th level, the SELECT statement to fetch the records becomes more complex with this kind of table structure. Suppose, we want to fetch the complete TREE of a particular employee, i.e. list of all the employees who are directly or indirectly managed by a particular employee. How to do it……..?

Thanks to CTE’s for making the life a bit easier – as using them in a recursive manner, we can get the work done. Please follow this msdn link to see an implementation using recursive CTE.

Suggested Table Structure

2

Here, I have just included a new column [PATH]. It is of VARCHAR(MAX) type. I have taken it as VARCHAR(MAX) just to make sure the field is long enough to store the complete path. But one can assign appropriate size as per their system’s requirement.

The basic idea of the [path] column is to store the complete hierarchical path of any employee separated by a delimiter as under -

3

Calculating the new path is very simple. It’s just, {New Path} = {Parent Path} + {Self ID} + {Delimiter}

Now, suppose if I want to fetch all the employees who are directly or indirectly working under EmployeeID = 2, I can use the below tsql -

;WITH CTE AS (
SELECT 1 EmployeeID,NULL ManagerID, '\1\' [Path]
UNION ALL    
SELECT 2 EmployeeID,1 ManagerID, '\1\2\' [Path]
UNION ALL    
SELECT 3 EmployeeID,1 ManagerID, '\1\3\' [Path]
UNION ALL    
SELECT 4 EmployeeID,2 ManagerID, '\1\2\4\' [Path]
UNION ALL    
SELECT 5 EmployeeID,4 ManagerID, '\1\2\4\5\' [Path]
)
SELECT
  *
FROM
  CTE
WHERE
  [Path] LIKE '%\2\%'

We can use a simple logic to even find out the level of the Employee -


SELECT
  *,
  (LEN([Path]) - LEN(REPLACE([Path],'\',''))) - 2 [Level]
FROM
  CTE
WHERE
  [Path] LIKE '%\2\%'

4

2 is subtracted from the formula as the length of delimiter for Level-0 is 2.

Conclusion

Hope, this simple trick could save a lot of time for the ones who find themselves lost playing with the hierarchical data.

4 comments:

  1. Pretty bad example. Even SQL help's recursive cte does a better job of building a hierarchical structure. Has no relationship to the new hierarchyid. The level logic is contrived and only works correctly on single digit numbers. (See GetLevel method of hierarchyid)

    ReplyDelete
    Replies
    1. Hello,
      Thanks for posting back your concerns. I will try and answer them taking one at a time -

      1. "recursive cte does a better job of building a hierarchical structure" - Could you please explain what exactly do you mean by "better job"; I mean in which context? If you are talking about performance, then I would request you to please come up with your test script to support it.

      2. "no relationship to the new hierarchyid" - Yes, you are correct as the intention was to come up with a simple logic which could be used for the versions lower than 2008.

      3. "The level logic is contrived and only works correctly on single digit numbers" - Below is the script which proves it even works for multiple digits

      ;WITH CTE AS (
      SELECT 1 EmployeeID,NULL ManagerID, '\1\' [Path]
      UNION ALL
      SELECT 12 EmployeeID,1 ManagerID, '\1\12\' [Path]
      UNION ALL
      SELECT 3 EmployeeID,1 ManagerID, '\1\3\' [Path]
      UNION ALL
      SELECT 24 EmployeeID,2 ManagerID, '\1\12\24\' [Path]
      UNION ALL
      SELECT 5 EmployeeID,4 ManagerID, '\1\12\24\5\' [Path]
      )
      SELECT
      *,
      (LEN([Path]) - LEN(REPLACE([Path],'\',''))) - 2 [Level]
      FROM
      CTE
      WHERE
      [Path] LIKE '%\12\%'

      Please revert if you feel otherwise.

      Delete
  2. Sorry, I was wrong about the 2 digit numbers, but the level is still contrived.

    ReplyDelete
    Replies
    1. Thanks for reverting so promptly. This is what I want to know - how the level is contrived?

      Can you please elaborate more on this. I will try and correct myself if at all what you are saying is correct.

      Delete