Chapter 4: Understanding SQL Data Types

In the previous chapter, you learned how to create tables and define columns. When defining a column, one of the most critical decisions you make is choosing its data type. A data type tells the database what kind of values a column can hold (e.g., numbers, text, dates) and how much storage space it requires. Choosing the correct data type is essential for data integrity, efficient storage, and optimal query performance. This chapter will explore the most commonly used SQL data types in MySQL/MariaDB, focusing on their purpose and best practices for web development.

Why Data Types Matter

  • Data Integrity: Ensures that only valid data of the expected kind is stored in a column. For example, you can’t accidentally put text into a column designed for numbers.
  • Storage Efficiency: Different data types consume varying amounts of storage space. Choosing the smallest appropriate type can save disk space, which is crucial for large databases.
  • Performance: Databases are optimized to work with specific data types. Using the correct type can significantly speed up data retrieval and manipulation operations.
  • Validation: Data types provide a basic level of validation. If you try to insert data that doesn’t match the column’s data type, the database will often throw an error, preventing invalid data from entering your system.

Common SQL Data Types (MySQL/MariaDB Focus)

SQL data types are broadly categorized into Numeric, String, Date/Time, and Boolean types.

1. Numeric Data Types

Used for storing numbers, including integers, decimals, and floating-point numbers.

  • INT (Integer)
    • Purpose: Stores whole numbers (no decimal places). Ideal for IDs, counts, quantities.
    • Range: Typically from -2,147,483,648 to 2,147,483,647 (signed 4-byte integer).
    • Example Use: user_id, product_id, stock_quantity, age.
    • Syntax Example: column_name INT
    • Variations: TINYINT (very small integer, e.g., for booleans 0-1), SMALLINT, MEDIUMINT, BIGINT (for very large numbers). Use the smallest INT type that fits your range to save space.
  • DECIMAL(P, S) (Fixed-Point Number)
    • Purpose: Stores exact numeric values with a fixed number of decimal places. Essential for financial data where precision is critical (e.g., prices, currency).
    • Parameters:
      • P (Precision): Total number of digits allowed (both before and after the decimal point). Maximum is typically 65.
      • S (Scale): Number of digits allowed after the decimal point.
    • Example Use: price, tax_rate, discount_percentage.
    • Syntax Example: price DECIMAL(10, 2) (can store numbers up to 99,999,999.99)
    • Note: Avoid FLOAT or DOUBLE for monetary values due to potential floating-point inaccuracies.

2. String Data Types

Used for storing text, characters, and binary data.

  • VARCHAR(size) (Variable-Length String)
    • Purpose: Stores variable-length strings of text. The size parameter specifies the maximum number of characters that can be stored. This is the most commonly used string type for web data.
    • Size: Up to 65,535 characters, but practical limits apply based on row size.
    • Example Use: username, email, product_name, short_description, URL.
    • Syntax Example: username VARCHAR(50), email VARCHAR(100)
    • Tip: Choose a size that is reasonable for the data you expect. It only consumes space for the actual characters stored, plus 1 or 2 bytes for length information.
  • TEXT (Long String)
    • Purpose: Stores longer blocks of text, typically for content that can exceed VARCHAR limits.
    • Size: Up to 65,535 characters (64KB).
    • Example Use: product_description, blog_post_content, comment_body.
    • Variations: TINYTEXT, MEDIUMTEXT, LONGTEXT for even larger amounts of text.
  • CHAR(size) (Fixed-Length String)
    • Purpose: Stores fixed-length strings. If the stored string is shorter than size, it’s padded with spaces. If it’s longer, it’s truncated.
    • Size: Max 255 characters.
    • Example Use: Postal codes (if fixed length), country codes (e.g., ‘US’, ‘GB’), UUIDs.
    • Syntax Example: country_code CHAR(2)
    • Note: Less common than VARCHAR unless you have genuinely fixed-length data, as it uses more space for shorter strings.

3. Date and Time Data Types

Used for storing dates, times, or combinations of both.

  • DATE
    • Purpose: Stores a date (year, month, day).
    • Format: YYYY-MM-DD.
    • Example Use: birth_date, order_date, event_date.
    • Syntax Example: event_date DATE
  • TIME
    • Purpose: Stores a time (hour, minute, second).
    • Format: HH:MM:SS.
    • Example Use: opening_time, meeting_time.
    • Syntax Example: start_time TIME
  • DATETIME
    • Purpose: Stores a combination of date and time.
    • Format: YYYY-MM-DD HH:MM:SS.
    • Example Use: created_at, updated_at, last_login.
    • Syntax Example: post_timestamp DATETIME
  • TIMESTAMP
    • Purpose: Similar to DATETIME, but stores the number of seconds since the Unix epoch (January 1, 1970, UTC). Often used for tracking record changes. Automatically updates when a row is modified if configured correctly.
    • Format: YYYY-MM-DD HH:MM:SS (MySQL displays it in this format).
    • Example Use: last_modified, session_start.
    • Syntax Example: last_update TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
    • Note: DATETIME has a wider range than TIMESTAMP (which has a practical limit around 2038). For general purpose dates and times, DATETIME is often preferred unless you specifically need the automatic update feature of TIMESTAMP or its smaller storage footprint.

4. Boolean Data Type

SQL doesn’t have a direct BOOLEAN data type in MySQL/MariaDB.

  • TINYINT(1)
    • Purpose: Commonly used to represent boolean values (TRUE/FALSE). 1 is typically TRUE, 0 is FALSE.
    • Example Use: is_active, is_admin, email_verified.
    • Syntax Example: is_active TINYINT(1) DEFAULT 0
    • Note: While MySQL allows BOOLEAN as an alias, it internally converts it to TINYINT(1).

Best Practices/Tips for Choosing Data Types

  • Be Specific: Always choose the most specific (and usually smallest) data type that can accommodate the data you expect. Don’t use VARCHAR(255) if VARCHAR(50) is sufficient. Don’t use INT if TINYINT works.
  • Match Data to Type: Ensure your data type logically matches the data you’re storing (e.g., DECIMAL for money, DATE for dates, VARCHAR for names).
  • Consider Future Growth: While being specific is good, also consider potential future growth of data. If a username might realistically grow beyond 50 characters, allocate more space from the start or plan for an ALTER TABLE.
  • Hashing Passwords: Always store passwords as hashes (e.g., using PHP’s password_hash() function). A VARCHAR(255) is typically sufficient for storing these hashes. Never store plain text passwords.
  • Timestamps: For created_at and updated_at columns, DATETIME with DEFAULT CURRENT_TIMESTAMP and ON UPDATE CURRENT_TIMESTAMP (for updated_at) are very common patterns to automatically track when records are created or last modified.

SQL Syntax & Examples

Let’s refine our table creation with a focus on data types, using scriptbuzz_db.

First, ensure you are using the correct database:

SQL
USE scriptbuzz_db;

Example 1: Re-creating users table with specific data types

(Assuming you’ve dropped it or are creating it fresh)

SQL
-- Drop the table if it already exists (for clean re-creation)
DROP TABLE IF EXISTS users;

CREATE TABLE users (
    user_id INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(50) NOT NULL UNIQUE,       -- Max 50 chars, required, must be unique
    email VARCHAR(100) NOT NULL UNIQUE,         -- Max 100 chars, required, must be unique
    password VARCHAR(255) NOT NULL,             -- Max 255 chars for password hashes, required
    is_admin TINYINT(1) DEFAULT 0,              -- 0 for false, 1 for true, defaults to 0
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP, -- Auto-set current timestamp on creation
    last_login DATETIME                         -- Can be NULL if user hasn't logged in yet
);

Explanation: We added is_admin to demonstrate TINYINT(1) for boolean flags and last_login which can be NULL.

Example 2: Re-creating products table with specific data types

SQL
-- Drop the table if it already exists (for clean re-creation)
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,                 -- Max 100 chars, required
    category VARCHAR(50),                       -- Max 50 chars, optional (can be NULL)
    description TEXT,                           -- For longer text, optional
    price DECIMAL(10,2) NOT NULL,               -- Up to 99,999,999.99, required
    stock_quantity INT NOT NULL DEFAULT 0,      -- Whole number, required, defaults to 0
    is_active TINYINT(1) DEFAULT 1,             -- 0 for inactive, 1 for active, defaults to 1
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP, -- Auto-set current timestamp on creation
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP -- Auto-update on modification
);

Explanation: We added category (optional VARCHAR), is_active (a boolean flag), and updated_at which automatically updates its timestamp whenever the row is modified.

To run these examples:

  1. Open phpMyAdmin or MySQL Workbench.
  2. Select scriptbuzz_db.
  3. Copy and paste the DROP TABLE IF EXISTS and CREATE TABLE statements for users, then execute.
  4. Repeat for products.
  5. Check the “Structure” or “Columns” tab for each table to verify the data types and constraints.

Notes:

  • Choosing the correct data type for each column is crucial for data integrity, storage efficiency, and performance.
  • Numeric Types:
    • INT: For whole numbers (IDs, quantities).
    • DECIMAL(P,S): For precise numbers with decimals (currency, prices).
  • String Types:
    • VARCHAR(size): For variable-length text (names, emails). Most common.
    • TEXT: For long blocks of text (descriptions, content).
  • Date/Time Types:
    • DATE: For dates only (YYYY-MM-DD).
    • DATETIME: For date and time (YYYY-MM-DD HH:MM:SS).
    • TIMESTAMP: Similar to DATETIME, often used for tracking last modified times.
  • Boolean: Use TINYINT(1) (0 for false, 1 for true) in MySQL/MariaDB.
  • Always strive to use the most specific and smallest data type possible for your needs.

Small Exercise

Imagine you need to add a table for blog posts to scriptbuzz_db.

  1. Design the posts table schema in your mind or on paper, considering the following information:
    • A unique ID for each post.
    • The title of the post.
    • The actual content of the post.
    • The ID of the user who authored the post (referencing user_id from the users table).
    • The date and time the post was created.
    • The date and time the post was last updated.
    • A flag indicating if the post is published or a draft.
  2. Write the CREATE TABLE SQL query for this posts table, selecting appropriate data types and constraints (PRIMARY KEY, NOT NULL, DEFAULT, AUTO_INCREMENT, TINYINT(1) for the published flag). Don’t worry about foreign keys yet for user_id.
  3. Execute your CREATE TABLE query in phpMyAdmin or MySQL Workbench within scriptbuzz_db to verify it works.

🚀 Explore Popular Learning Tracks