Chapter 10: Aggregate Functions
So far, you’ve focused on retrieving individual rows or specific sets of rows from your tables. But what if you need to perform calculations on a group of rows? For example, you might want to know the total number of users, the average price of all products, the most expensive product, or the number of items currently in stock.
This is where aggregate functions come in. Aggregate functions perform a calculation on a set of rows and return a single summary value. They are incredibly powerful for generating reports, statistics, and overview information for your web application dashboards.
What are Aggregate Functions?
Aggregate functions operate on a collection of values (usually from a column) and return a single summary value. They are often used with the GROUP BY
clause (which we’ll cover in the next chapter) to perform calculations on specific groups of data, but they can also be used on the entire table.
The most common aggregate functions are:
COUNT()
: Counts the number of rows.SUM()
: Calculates the sum of values in a numeric column.AVG()
: Calculates the average value of a numeric column.MIN()
: Finds the minimum value in a column.MAX()
: Finds the maximum value in a column.
SQL Syntax & Examples
Let’s use our scriptbuzz_db
and the data we’ve populated. Remember to run USE scriptbuzz_db;
first.
1. COUNT()
: Counting Rows
COUNT()
is used to count the number of rows that match a specified criterion.
Syntax:
SELECT COUNT(column_name) FROM table_name WHERE condition;
SELECT COUNT(*) FROM table_name WHERE condition;
COUNT(*)
: Counts all rows, including those withNULL
values.COUNT(column_name)
: Counts only the rows wherecolumn_name
is notNULL
.
Example 1: Total number of users
SELECT COUNT(*) AS total_users FROM users;
Expected Output (example):
total_users |
3 |
What it does: Returns the total count of all rows in the users
table. We use an alias AS total_users
to give the result column a meaningful name.
Example 2: Number of products currently active
SELECT COUNT(product_id) AS active_products_count FROM products
WHERE is_active = 1;
Expected Output (example, based on previous product insertions):
active_products_count |
6 |
What it does: Counts the product_id
(which is never NULL) for all products where is_active
is 1
.
Example 3: Number of products with a description (i.e., description is not NULL)
SELECT COUNT(description) AS products_with_description FROM products;
Expected Output (example, based on previous product insertions):
products_with_description |
3 |
What it does: Counts only the rows where the description
column has a non-NULL
value. Compare this to COUNT(*)
, which would count all products regardless of whether they have a description.
2. SUM()
: Summing Numeric Values
SUM()
calculates the total sum of values in a specified numeric column. It ignores NULL
values.
Syntax:
SELECT SUM(numeric_column) FROM table_name WHERE condition;
Example 4: Total value of all products in stock
SELECT SUM(price * stock_quantity) AS total_inventory_value FROM products;
Expected Output (example):
total_inventory_value
7965.00
(Calculated as: (120050) + (25.5200) + (99.9930) + (39100) + (650) + (35020) + (30150) + (7540) = 60000 + 5100 + 2999.7 + 3900 + 0 + 7000 + 4500 + 3000 = 86499.7 – wait. Let me recalculate with the output products given.
- Laptop Pro: 1200 * 50 = 60000
- Wireless Mouse: 25.50 * 200 = 5100
- Mechanical Keyboard: 99.99 * 30 = 2999.70
- USB-C Hub: 39.00 * 100 = 3900
- Webcam 1080p: 65.00 * 0 = 0
- Gaming Monitor: 350.00 * 20 = 7000
- USB Keyboard: 30.00 * 150 = 4500
- Gaming Mouse: 75.00 * 40 = 3000
Total: 60000 + 5100 + 2999.70 + 3900 + 0 + 7000 + 4500 + 3000 = 86499.70.
Corrected Expected Output:
total_inventory_value |
86499.70 |
What it does: Calculates the sum of (price multiplied by stock quantity) for all products, giving you the total value of your inventory.
3. AVG()
: Calculating Average Value
AVG()
calculates the average value of a specified numeric column. It ignores NULL
values.
Syntax:
SELECT AVG(numeric_column) FROM table_name WHERE condition;
Example 5: Average price of all active products
SELECT AVG(price) AS average_active_product_price FROM products
WHERE is_active = 1;
Expected Output (example, based on active products: 1200, 25.5, 99.99, 39, 350, 30, 75):
average_active_product_price |
278.4271428571 |
(Calculated as (1200 + 25.5 + 99.99 + 39 + 350 + 30 + 75) / 7 = 1849.49 / 7 = 264.2128571428571) |
Corrected Expected Output (using the 7 active products from the product list):
average_active_product_price |
264.212857 |
What it does: Sums the prices of all active products and divides by the count of active products.
4. MIN()
: Finding the Minimum Value
MIN()
finds the smallest value in a specified column. It ignores NULL
values.
Syntax:
SELECT MIN(column_name) FROM table_name WHERE condition;
Example 6: Cheapest product price
SELECT MIN(price) AS cheapest_product_price FROM products;
Expected Output (example):
cheapest_product_price |
25.50 |
What it does: Scans the price
column and returns the lowest value found.
5. MAX()
: Finding the Maximum Value
MAX()
finds the largest value in a specified column. It ignores NULL
values.
Syntax:
SELECT MAX(column_name) FROM table_name WHERE condition;
Example 7: Most expensive product price
SELECT MAX(price) AS most_expensive_product_price FROM products;
Expected Output (example):
most_expensive_product_price |
1200.00 |
What it does: Scans the price
column and returns the highest value found.
6. Combining Aggregate Functions with WHERE
Aggregate functions can be combined with the WHERE
clause to perform calculations on a filtered subset of data.
Example 8: Number of inactive products
SELECT COUNT(*) AS inactive_products FROM products
WHERE is_active = 0;
Expected Output (example):
inactive_products |
1 |
What it does: First filters the products
table for inactive items, then counts the rows in that filtered set.
Tips
- Aliases are Key: Always use
AS
to give your aggregated columns clear, descriptive names. This makes the query results much more understandable, especially when displayed in a web application. NULL
Values: Remember thatSUM()
,AVG()
,MIN()
, andMAX()
functions ignoreNULL
values.COUNT(column_name)
also ignoresNULL
s, whileCOUNT(*)
includes all rows. Be aware of this behavior when interpreting results.- Performance: Aggregating data, especially on large tables, can be resource-intensive. For frequently run aggregate queries, consider adding appropriate indexes or pre-calculating and storing aggregate values if they don’t change often.
Notes:
- Aggregate functions perform calculations on a set of rows and return a single summary value.
COUNT(*)
: Counts all rows.COUNT(column)
: Counts non-NULL
values in a column.SUM(numeric_column)
: Calculates the total sum.AVG(numeric_column)
: Calculates the average.MIN(column)
: Finds the smallest value.MAX(column)
: Finds the largest value.- Aggregate functions are often combined with the
WHERE
clause to perform calculations on filtered data. - Use
AS
to alias the resulting aggregate columns for clarity.
Small Exercise
Using your scriptbuzz_db
:
- Find the total number of products in your
products
table. - Calculate the average
stock_quantity
for all products. - Determine the email address of the user who has the lowest
user_id
(assuming lower ID means older user). - If you have an
orders
table, calculate the totaltotal_amount
of all orders, and also find the maximumtotal_amount
of a single order.
You’ve now added powerful analytical capabilities to your SQL toolkit! Next, we’ll learn how to apply these aggregate functions to groups within your data, which is where things get even more interesting.