Step-by-step solution to a popular JPMorgan data engineering interview question involving recursive SQL queries to find employees reporting to the company head hierarchically.
Blog
Hey there! Nishchay here, a Data Engineer 3 at Walmart.
I've been working in the field of data engineering for quite some time now, and I've had the opportunity to work and interview with some of the biggest companies in the industry.
Today, I want to share with you an interesting question that was asked during one of my interviews at JPMorgan for a Data Engineer position.
The question was related to finding employees who report (directly or indirectly) to the head of the company using SQL.
Letโs look at the question and itโs solution.
Write an SQL query to find the employee IDs of all employees who directly or indirectly report their work to the head of the company.
Given the following table structure:
Your query should return all employee IDs (including the head of the company) who directly or indirectly report to the head of the company.
To achieve the goal of finding all employees who directly or indirectly report to the head of the company (employee with employee_id = 1
), we can use a recursive Common Table Expression (CTE) in SQL.
This approach allows us to traverse the hierarchical manager-employee relationship in the table.
Here's how you can write the SQL query:
WITH RECURSIVE EmployeeChain AS (
-- Base case: Select the head of the company
SELECT employee_id, manager_id
FROM Employees
WHERE employee_id = 1
UNION ALL
-- Recursive step: Select employees who report (directly or indirectly) to those already in the chain
SELECT e.employee_id, e.manager_id
FROM Employees e
INNER JOIN EmployeeChain ec ON e.manager_id = ec.employee_id
)
SELECT employee_id
FROM EmployeeChain;
๐ Recursive CTE Initialization (WITH RECURSIVE
):
This part of the query defines a recursive common table expression named EmployeeChain
. The CTE is used to recursively find all employees under the head of the company.
๐ Base Case:
The query starts with the head of the company (employee_id = 1
) as the initial member of the chain. This ensures that the head of the company is included in the result set.
๐ Recursive Step:
The query then recursively joins the Employees
table with the EmployeeChain
CTE. This step is crucial for traversing the hierarchy.
It matches each employee's manager_id
with the employee_id
of employees already included in the EmployeeChain
.
This process repeats until no more employees can be added to the chain, effectively capturing all employees who report directly or indirectly to the head.
๐ Final Selection:
The final SELECT
statement retrieves the employee_id
from the EmployeeChain
CTE. This result set includes all employees who report, directly or indirectly, to the head of the company.
This query leverages the hierarchical structure of the manager-employee relationship to find all employees connected to the head of the company, showing the power of recursive CTEs in handling hierarchical data in SQL databases.
*******
You can read my full JPMorgan Data Engineer Interview Experience here.
If you're aiming for top firms and aren't afraid of putting in the hard work, I'd love to be your interview prep buddy!
Having gone through this process myself, I can share tips and resources on everything from scalable data modeling to estimating query runtimes.
We can walk through practice questions together, dissect optimal approaches, and get you interview-ready.
Copyright ยฉ2024 Preplaced.in
Preplaced Education Private Limited
Ibblur Village, Bangalore - 560103
GSTIN- 29AAKCP9555E1ZV