Learn to solve the query to retrieve top customers by total spend - a must-know for data roles at MAANG companies like Amazon and Microsoft. Detailed solution included.
Blog
For any business, knowing who are the biggest spending customers is super important.
It helps you figure out who to focus your marketing efforts on, what products or services they love, and how to keep them coming back for more.
That's why questions around analysing customer spend data are so common in interviews, especially at big tech companies like MAANG (Meta, Amazon, Apple, Netflix, and Google).
One question you're likely to come across is something along the lines of: "Retrieve the top N customers based on how much they've spent over a certain period, and include their total spend amount."
It's a classic query that tests your SQL skills and your ability to turn a business requirement into an efficient database query.
In this article, we will tackle this SQL interview question for data science roles. Let’s get started!
You have a database with two tables:
and Customers
. Orders
The
table contains customer information such as Customers
, CustomerID
, and CustomerName
. CustomerEmail
The
table contains order details like Orders
, OrderID
, CustomerID
, and OrderDate
.TotalAmount
Write a SQL query to retrieve the top 3 customers who have spent the most money on orders in the last year, along with their total spend during that period.
SELECT
c.CustomerName,
SUM(o.TotalAmount) AS TotalSpent
FROM
Customers c
INNER JOIN
Orders o ON c.CustomerID = o.CustomerID
WHERE
o.OrderDate >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR)
GROUP BY
c.CustomerID
ORDER BY
TotalSpent DESC
LIMIT 3;
Here's a breakdown of the query:
SELECT c.CustomerName, SUM(o.TotalAmount) AS TotalSpent
: This part retrieves the CustomerName
from the Customers
table and calculates the total amount spent by each customer using the SUM
aggregate function on the TotalAmount
column from the Orders
table. The total spend is aliased as TotalSpent
. FROM Customers c INNER JOIN Orders o ON c.CustomerID = o.CustomerID
: This line joins the Customers
and Orders
tables based on the CustomerID
column, allowing us to connect customer information with their order details. WHERE o.OrderDate >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR)
: This WHERE
clause filters the Orders
table to include only orders placed within the last year. The DATE_SUB
function subtracts one year from the current date (CURDATE( )
)to get the cutoff date for the last year. GROUP BY c.CustomerID
: This line groups the results by the CustomerID
, so that the SUM
function can calculate the total spend for each individual customer. ORDER BY TotalSpent DESC
: The results are ordered in descending order based on the TotalSpent
column, ensuring that the customers with the highest total spend appear first. LIMIT 3
: This limits the output to the top 3 rows, giving us the top 3 customers who have spent the most money in the last year, along with their respective total spends. The query first joins the
and Customers
Orders
tables to connect customer information with their order details.
It then filters the
table to include only orders placed within the last year, groups the results by Orders
, calculates the total spend for each customer using CustomerID
, orders the results by the total spend in descending order, and finally limits the output to the top 3 rows.SUM(TotalAmount)
This query provides the top 3 customers based on their total spend in the last year, along with their respective
and CustomerName
values.TotalSpent
If you'd like to learn more about tackling such SQL interview questions or want to practice with additional examples, feel free to connect with me.
I'd be happy to share my insights and help you prepare for success in your upcoming interviews.
Copyright ©2024 Preplaced.in
Preplaced Education Private Limited
Ibblur Village, Bangalore - 560103
GSTIN- 29AAKCP9555E1ZV