Author(s): Gitanjali
Originally published on Towards AI.
SQL powers most data analyst roles today. If you work with data, chances are you spend a large portion of your day writing SQL queries.
SQL allows you to work directly with data stored in a database such as PostgreSQL, MySQL, BigQuery, or SQL Server. For many common analytical tasks, especially when datasets are not very large, SQL is often faster and more efficient than loading data into Python and using pandas.
This guide is not about learning every SQL command. It focuses on the queries analysts actually use daily. These queries filter data, join tables, calculate metrics, and answer business questions.
We'll cover 17 essential SQL queries using realistic sales and e-commerce style datasets. Each query includes clear examples that you can copy, run, and practice. These are the same patterns that are used in real analyst jobs.
You can practice using free tools like DB-Fiddle, the BigQuery sandbox, or a local MySQL/PostgreSQL setup. A simple sample sales database is used throughout the guide to keep everything practical, not theoretical.
Writing SQL regularly matters. Analysts who practice querying data every day become faster at analysis, debugging, and problem-solving, and that skill translates directly into better job opportunities and stronger portfolios.
Sample Database Setup (Run this in your SQL editor first):
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(50),
city VARCHAR(30),
join_date DATE
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
product VARCHAR(50),
amount DECIMAL(10,2),
order_date DATE
);
CREATE TABLE products (
product_id INT PRIMARY KEY,
product VARCHAR(50),
category VARCHAR(30),
price DECIMAL(10,2)
);
INSERT INTO customers VALUES
(1,'Alice','Delhi', '2024-01-15'),
(2,'Bob','Mumbai','2024-02-20'),
(3,'Charlie','Delhi','2024-03-10');
INSERT INTO orders VALUES
(101,1,'Laptop',15000,'2024-06-01'),
(102,1,'Mouse',500,'2024-06-05'),
(103,2,'Laptop',15000,'2024-06-10'),
(104,3,'Phone',20000,'2024-06-15'),
(105,1,'Phone',20000,'2024-07-01');
Core Queries: SELECT, WHERE, ORDER BY, LIMIT
1. Basic selection: grab specific columns or everything.
SELECT name, city FROM customers; -- Specific columns
SELECT * FROM orders WHERE amount > 10000 ORDER BY order_date DESC LIMIT 3;
Output: the top 3 high-value orders, newest first. Warning: SELECT * is slow on large, wide tables; name the columns you need. Analyst use: quick data preview. (KDnuggets)
2. Filtering with WHERE: AND/OR/NOT/IN/BETWEEN/LIKE.
SELECT * FROM orders
WHERE customer_id IN (1,2) AND amount BETWEEN 500 AND 15000
AND product LIKE 'Laptop%'; -- Wildcards: % any, _ one char
Pro tip: compare dates with >=, e.g. WHERE order_date >= '2024-06-01'. For missing values, use IS NULL, not = NULL. (LinkedIn)
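The NULL tip is easy to verify. The sketch below uses Python's built-in sqlite3 module as a throwaway SQL engine; the order with a missing amount (order 106) is a hypothetical row added only for this illustration and is not part of the guide's sample data.

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient sandbox.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT, "
    "product VARCHAR(50), amount DECIMAL(10,2), order_date DATE)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (101, 1, "Laptop", 15000, "2024-06-01"),
        (102, 1, "Mouse", 500, "2024-06-05"),
        (106, 2, "Cable", None, "2024-07-03"),  # hypothetical: amount unknown
    ],
)

# "= NULL" never evaluates to true, so this matches no rows.
equals_null = conn.execute(
    "SELECT order_id FROM orders WHERE amount = NULL"
).fetchall()

# IS NULL is the correct test for a missing value.
is_null = conn.execute(
    "SELECT order_id FROM orders WHERE amount IS NULL"
).fetchall()

print(equals_null)  # []
print(is_null)      # [(106,)]
conn.close()
```

The same behaviour holds in MySQL and PostgreSQL, since comparing anything to NULL with = yields NULL rather than true.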
3. Sorting and Limiting – ORDER BY ASC/DESC, LIMIT/TOP (SQL Server).
SELECT customer_id, SUM(amount) AS total_spent
FROM orders GROUP BY customer_id
ORDER BY total_spent DESC LIMIT 2;
Alias (AS): gives clear column names for reports and exports to Excel/Python.
Aggregation: GROUP BY, HAVING, COUNT/SUM/AVG
4. GROUP BY basics: summarize by category.
SELECT c.city, COUNT(DISTINCT c.customer_id) AS customers, AVG(o.amount) AS avg_order
FROM customers c JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.city; -- DISTINCT: the join repeats a customer once per order
WHERE vs HAVING: WHERE filters rows before grouping; HAVING filters groups after aggregation. (LinkedIn)
5. HAVING: filter groups.
SELECT customer_id, COUNT(order_id) AS orders
FROM orders
GROUP BY customer_id
HAVING COUNT(order_id) > 1 AND SUM(amount) > 20000;
Business use: marketing to high-value repeat buyers.
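To see the WHERE/HAVING difference from queries 4 and 5 on the sample orders, here is a small sketch run through Python's sqlite3 module (chosen only because it needs no server; the SQL itself is standard). The threshold of 1000 is an arbitrary value picked for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT, "
    "product VARCHAR(50), amount DECIMAL(10,2), order_date DATE)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (101, 1, "Laptop", 15000, "2024-06-01"),
        (102, 1, "Mouse", 500, "2024-06-05"),
        (103, 2, "Laptop", 15000, "2024-06-10"),
        (104, 3, "Phone", 20000, "2024-06-15"),
        (105, 1, "Phone", 20000, "2024-07-01"),
    ],
)

# WHERE runs first: the 500 Mouse order is dropped before counting.
where_rows = conn.execute(
    "SELECT customer_id, COUNT(*) FROM orders "
    "WHERE amount > 1000 "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall()

# HAVING runs last: all rows are counted, then small groups are dropped.
having_rows = conn.execute(
    "SELECT customer_id, COUNT(*) FROM orders "
    "GROUP BY customer_id HAVING COUNT(*) > 1"
).fetchall()

print(where_rows)   # [(1, 2), (2, 1), (3, 1)]
print(having_rows)  # [(1, 3)]
conn.close()
```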
Joins: The 80% Daily Workhorse
Data lives in separate tables; JOIN stitches them together. The main types:
| Join Type | What It Keeps | Common Use Case | Visual Idea |
|------------------|----------------------------------------|------------------------------------------------------|-------------|
| INNER JOIN | Only matching rows from both tables | Customers who placed orders | Overlap |
| LEFT JOIN | All rows from left + matches from right| All customers and their orders (NULL if none) | Left full |
| RIGHT JOIN | All rows from right + matches from left| All products and their sales (NULL if none) | Right full |
| FULL OUTER JOIN | All rows from both tables | Combining two datasets with partial overlap | Both full |
6. INNER JOIN: the most common.
SELECT c.name, o.product, o.amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;
7. LEFT JOIN: all customers, even those with no orders.
SELECT c.name, COUNT(o.order_id) AS orders_placed
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;
Warning: multiple matches inflate COUNT; use COUNT(DISTINCT ...). (abeltavares.hashnode)
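The inflation is easiest to see when counting customers per city after a join: each customer appears once per order. A minimal sketch on the guide's sample data, run through Python's sqlite3 module as a sandbox (the column aliases are illustrative names, not from the original):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INT PRIMARY KEY, name VARCHAR(50),
                        city VARCHAR(30), join_date DATE);
CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT,
                     product VARCHAR(50), amount DECIMAL(10,2), order_date DATE);
INSERT INTO customers VALUES
  (1,'Alice','Delhi','2024-01-15'),
  (2,'Bob','Mumbai','2024-02-20'),
  (3,'Charlie','Delhi','2024-03-10');
INSERT INTO orders VALUES
  (101,1,'Laptop',15000,'2024-06-01'),
  (102,1,'Mouse',500,'2024-06-05'),
  (103,2,'Laptop',15000,'2024-06-10'),
  (104,3,'Phone',20000,'2024-06-15'),
  (105,1,'Phone',20000,'2024-07-01');
""")

# Alice has three orders, so the join emits her row three times.
per_city = conn.execute("""
    SELECT c.city,
           COUNT(c.customer_id)          AS inflated_count,
           COUNT(DISTINCT c.customer_id) AS true_count
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.city
    ORDER BY c.city
""").fetchall()

print(per_city)  # [('Delhi', 4, 2), ('Mumbai', 1, 1)]
conn.close()
```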
8. Multi-table + self-join: products to orders.
SELECT o.product, p.category, SUM(o.amount) AS revenue
FROM orders o
JOIN products p ON o.product = p.product
GROUP BY o.product, p.category;
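One caveat: the setup script creates the products table but never inserts rows into it, so query 8 returns nothing as written. The sketch below, run through Python's sqlite3 module, shows the empty result and then the result after loading three product rows. The category names are hypothetical values invented for this illustration; the prices simply mirror the order amounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT,
                     product VARCHAR(50), amount DECIMAL(10,2), order_date DATE);
CREATE TABLE products (product_id INT PRIMARY KEY, product VARCHAR(50),
                       category VARCHAR(30), price DECIMAL(10,2));
INSERT INTO orders VALUES
  (101,1,'Laptop',15000,'2024-06-01'),
  (102,1,'Mouse',500,'2024-06-05'),
  (103,2,'Laptop',15000,'2024-06-10'),
  (104,3,'Phone',20000,'2024-06-15'),
  (105,1,'Phone',20000,'2024-07-01');
""")

REVENUE_SQL = """
    SELECT o.product, p.category, SUM(o.amount) AS revenue
    FROM orders o
    JOIN products p ON o.product = p.product
    GROUP BY o.product, p.category
    ORDER BY o.product
"""

# With an empty products table the inner join matches nothing.
before = conn.execute(REVENUE_SQL).fetchall()

# Hypothetical product rows (categories are assumptions for the demo).
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        (1, "Laptop", "Electronics", 15000),
        (2, "Mouse", "Accessories", 500),
        (3, "Phone", "Electronics", 20000),
    ],
)
after = conn.execute(REVENUE_SQL).fetchall()

print(before)  # []
print(after)
# [('Laptop', 'Electronics', 30000), ('Mouse', 'Accessories', 500),
#  ('Phone', 'Electronics', 40000)]
conn.close()
```

Joining on the product name works for a demo; in a real schema the orders table would more likely carry a product_id foreign key.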
Subqueries and CTEs: Break the Complex into Simple
9. Subquery: a query within a query.
SELECT name FROM customers
WHERE customer_id IN (
SELECT customer_id FROM orders
WHERE amount > 15000
);
Performance: correlated subqueries (which run once per row) are slow; avoid them on big data. (YouTube)
10. CTE (WITH): a named temporary result, more readable.
WITH high_spenders AS (
SELECT customer_id, SUM(amount) AS total
FROM orders GROUP BY customer_id HAVING SUM(amount) > 20000 -- alias in HAVING works in MySQL but not PostgreSQL
)
SELECT c.name, hs.total FROM high_spenders hs
JOIN customers c ON hs.customer_id = c.customer_id;
Recursive CTE: for hierarchies (e.g. an organization chart), use WITH RECURSIVE. (abeltavares.hashnode)
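The guide's sample database has no hierarchy, so the following sketch of WITH RECURSIVE uses a hypothetical employees table (employee_id, name, manager_id) invented purely for illustration. It is run through Python's sqlite3 module; the same CTE syntax is accepted by PostgreSQL and MySQL 8+.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical org chart: Asha manages Ben and Chen; Ben manages Dev.
conn.executescript("""
CREATE TABLE employees (employee_id INT PRIMARY KEY, name VARCHAR(50),
                        manager_id INT);
INSERT INTO employees VALUES
  (1,'Asha',NULL),
  (2,'Ben',1),
  (3,'Chen',1),
  (4,'Dev',2);
""")

org_chart = conn.execute("""
    WITH RECURSIVE chain AS (
        -- Anchor: start from the person with no manager.
        SELECT employee_id, name, 0 AS depth
        FROM employees
        WHERE manager_id IS NULL
        UNION ALL
        -- Recursive step: add everyone who reports to someone already found.
        SELECT e.employee_id, e.name, ch.depth + 1
        FROM employees e
        JOIN chain ch ON e.manager_id = ch.employee_id
    )
    SELECT name, depth FROM chain ORDER BY depth, employee_id
""").fetchall()

print(org_chart)  # [('Asha', 0), ('Ben', 1), ('Chen', 1), ('Dev', 2)]
conn.close()
```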
Window Functions: Ranking without messy self-joins
11. ROW_NUMBER/RANK/LAG/LEAD: analytics gold.
SELECT
customer_id,
amount,
order_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS order_seq,
LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_amount
FROM orders;
- ROW_NUMBER: unique sequence.
- RANK: ties share a rank (1, 1, 3).
- LAG: previous row's value (e.g. year-over-year growth). (abeltavares.hashnode)
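The sample orders contain tied amounts, which makes the difference between the ranking functions visible. The sketch below runs through Python's sqlite3 module and assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). DENSE_RANK is added here for comparison and is not discussed in the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT, "
    "product VARCHAR(50), amount DECIMAL(10,2), order_date DATE)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (101, 1, "Laptop", 15000, "2024-06-01"),
        (102, 1, "Mouse", 500, "2024-06-05"),
        (103, 2, "Laptop", 15000, "2024-06-10"),
        (104, 3, "Phone", 20000, "2024-06-15"),
        (105, 1, "Phone", 20000, "2024-07-01"),
    ],
)

ranked = conn.execute("""
    SELECT order_id,
           amount,
           ROW_NUMBER() OVER (ORDER BY amount DESC, order_id) AS row_num,
           RANK()       OVER (ORDER BY amount DESC)           AS rnk,
           DENSE_RANK() OVER (ORDER BY amount DESC)           AS dense_rnk
    FROM orders
    ORDER BY amount DESC, order_id
""").fetchall()

for row in ranked:
    print(row)
# (104, 20000, 1, 1, 1)
# (105, 20000, 2, 1, 1)
# (101, 15000, 3, 3, 2)
# (103, 15000, 4, 3, 2)
# (102, 500, 5, 5, 3)
conn.close()
```

ROW_NUMBER needs a tie-breaker (order_id here) to be deterministic; RANK skips positions after a tie, while DENSE_RANK does not.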
12. Running total and percentile.
SELECT
order_date,
amount,
SUM(amount) OVER (ORDER BY order_date ROWS UNBOUNDED PRECEDING) AS running_total
FROM orders;
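The query above covers the running total; for the percentile half of the heading, one common option is PERCENT_RANK(). This is a sketch under that assumption, not necessarily the approach the author had in mind. It runs through Python's sqlite3 module and needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT, "
    "product VARCHAR(50), amount DECIMAL(10,2), order_date DATE)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (101, 1, "Laptop", 15000, "2024-06-01"),
        (102, 1, "Mouse", 500, "2024-06-05"),
        (103, 2, "Laptop", 15000, "2024-06-10"),
        (104, 3, "Phone", 20000, "2024-06-15"),
        (105, 1, "Phone", 20000, "2024-07-01"),
    ],
)

rows = conn.execute("""
    SELECT order_date,
           amount,
           SUM(amount) OVER (ORDER BY order_date
                             ROWS UNBOUNDED PRECEDING) AS running_total,
           -- PERCENT_RANK = (rank - 1) / (row count - 1), between 0 and 1
           PERCENT_RANK() OVER (ORDER BY amount)       AS pct_rank
    FROM orders
    ORDER BY order_date
""").fetchall()

running_totals = [r[2] for r in rows]
pct_ranks = [r[3] for r in rows]

print(running_totals)  # [15000, 15500, 30500, 50500, 70500]
print(pct_ranks)       # [0.25, 0.0, 0.25, 0.75, 0.75]
conn.close()
```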
String/Date Functions: Clean Real Data
13. Strings – UPPER/LOWER/LENGTH/SUBSTRING.
SELECT
UPPER(c.name) AS name_upper,
LENGTH(o.product) AS product_len, -- SQL Server uses LEN
SUBSTRING(o.product, 1, 3) AS short_prod -- MySQL and PostgreSQL; Oracle uses SUBSTR
FROM customers c JOIN orders o ON c.customer_id = o.customer_id;
14. Dates – Essential for analysts.
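-- MySQL syntax; PostgreSQL equivalents are noted per line
SELECT
order_date,
DAY(order_date) AS day, -- PostgreSQL: EXTRACT(DAY FROM order_date)
MONTHNAME(order_date) AS month, -- PostgreSQL: TO_CHAR(order_date, 'Month')
DATEDIFF('2024-12-27', order_date) AS days_ago -- PostgreSQL: DATE '2024-12-27' - order_date
FROM orders;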
To group by month: GROUP BY YEAR(order_date), MONTH(order_date). (GeeksforGeeks)
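Date functions vary by engine, so the monthly grouping looks different outside MySQL. As one illustration, the sketch below does the same month-level rollup in SQLite, where strftime('%Y-%m', ...) stands in for YEAR() and MONTH(). It runs through Python's sqlite3 module and assumes dates are stored as 'YYYY-MM-DD' text, as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT, "
    "product VARCHAR(50), amount DECIMAL(10,2), order_date DATE)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (101, 1, "Laptop", 15000, "2024-06-01"),
        (102, 1, "Mouse", 500, "2024-06-05"),
        (103, 2, "Laptop", 15000, "2024-06-10"),
        (104, 3, "Phone", 20000, "2024-06-15"),
        (105, 1, "Phone", 20000, "2024-07-01"),
    ],
)

# SQLite has no YEAR()/MONTH(); strftime builds a 'YYYY-MM' bucket instead.
monthly = conn.execute("""
    SELECT strftime('%Y-%m', order_date) AS month,
           SUM(amount)                   AS revenue
    FROM orders
    GROUP BY strftime('%Y-%m', order_date)
    ORDER BY month
""").fetchall()

print(monthly)  # [('2024-06', 50500), ('2024-07', 20000)]
conn.close()
```

Grouping on year and month together matters once the data spans more than one year; grouping on the month alone would merge June 2024 with June 2025.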
Advanced: CASE, UNION, Views
15. CASE: if-then logic in SQL.
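SELECT
name,
CASE
WHEN total > 20000 THEN 'VIP'
WHEN total > 10000 THEN 'Regular'
ELSE 'Newbie'
END AS segment
FROM (
-- total spend per customer; a CTE like the one in query 10 works here too
SELECT c.name, SUM(o.amount) AS total
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
) t;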
16. UNION: stack results (both queries need the same number of columns with compatible types).
SELECT name FROM customers
UNION
SELECT product FROM products;
17. CREATE VIEW – saved queries.
CREATE VIEW monthly_sales AS
SELECT MONTH(order_date) AS month, SUM(amount) AS revenue
FROM orders GROUP BY MONTH(order_date); -- repeat the expression: SQL Server rejects aliases in GROUP BY
-- Use: SELECT * FROM monthly_sales;
Performance and Best Practices
- Index: CREATE INDEX idx_customer ON orders(customer_id); – speeds up WHERE/JOIN.
- EXPLAIN: prefix a query with it to see the execution plan.
- Avoid SELECT * – name the columns you need.
- BigQuery tip: partition tables by date.
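The first two tips can be tried together. The sketch below uses SQLite through Python's sqlite3 module, where the plan command is EXPLAIN QUERY PLAN (MySQL and PostgreSQL use plain EXPLAIN, and their output looks different). The helper name plan is an illustrative choice, and the exact plan wording varies between SQLite versions, so only the index name is checked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT, "
    "product VARCHAR(50), amount DECIMAL(10,2), order_date DATE)"
)


def plan(sql):
    """Return the readable 'detail' column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM orders WHERE customer_id = 1"

# Without an index on customer_id, SQLite has to scan the table.
plan_before = plan(query)

conn.execute("CREATE INDEX idx_customer ON orders(customer_id)")

# With the index in place, the plan should mention idx_customer.
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
conn.close()
```

Reading the plan before and after a change is a quick way to confirm that an index is actually being used rather than assuming it.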
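Common errors:
| Error | Fix |
|-------------------------------|---------------------------------------------------|
| ORA-00904: invalid identifier | Column name is wrong. |
| Syntax error near ')' | Missing comma. |
| Query is very slow | Add LIMIT, an index, or a CTE on the subquery. |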
Portfolio Projects:
- Sales dashboard query: joins + window functions for top customers by revenue rank.
- Churn analysis: customers with no orders in the last 90 days (DATE_SUB).
- A/B test: group by variant, average the metrics, and export t-test-ready data.
- Python tie-in: export the query result to CSV → pd.read_csv().
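As a starting point for the churn project, here is a minimal sketch on the sample data. It runs in SQLite through Python's sqlite3 module, so date(?, '-90 days') stands in for MySQL's DATE_SUB. The reference date 2024-09-20 is an arbitrary assumption chosen so that the small sample gives a non-trivial result; a real query would use the current date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INT PRIMARY KEY, name VARCHAR(50),
                        city VARCHAR(30), join_date DATE);
CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT,
                     product VARCHAR(50), amount DECIMAL(10,2), order_date DATE);
INSERT INTO customers VALUES
  (1,'Alice','Delhi','2024-01-15'),
  (2,'Bob','Mumbai','2024-02-20'),
  (3,'Charlie','Delhi','2024-03-10');
INSERT INTO orders VALUES
  (101,1,'Laptop',15000,'2024-06-01'),
  (102,1,'Mouse',500,'2024-06-05'),
  (103,2,'Laptop',15000,'2024-06-10'),
  (104,3,'Phone',20000,'2024-06-15'),
  (105,1,'Phone',20000,'2024-07-01');
""")

REFERENCE_DATE = "2024-09-20"  # assumption: stands in for "today"

# Keep only recent orders in the join, then keep customers with no match.
churned = conn.execute("""
    SELECT c.name
    FROM customers c
    LEFT JOIN orders o
           ON o.customer_id = c.customer_id
          AND o.order_date >= date(?, '-90 days')
    WHERE o.order_id IS NULL
    ORDER BY c.name
""", (REFERENCE_DATE,)).fetchall()

print(churned)  # [('Bob',), ('Charlie',)]
conn.close()
```

Putting the date filter in the ON clause rather than in WHERE is the important design choice: it lets customers with only old orders survive the LEFT JOIN as unmatched rows.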
Query a Kaggle dataset daily. Master these 17, and you're interview ready.
- https://www.geeksforgeeks.org/sql/sql-data-analyse/
- https://www.linkedin.com/palse/copy-top-8-sql-queries-every-data-analyst-should-know-examples-pi9jf
- https://www.kdnuggets.com/sql-for-data-analysts-essential-queries-for-data-extraction-transformation
- https://www.linkedin.com/palse/sql-queries-youll-use-real-world-analytics-jobs-mryuf
Published via Towards AI
