The SQL Update Statement Explained: Queries for Updating Tables (including MySQL Examples)

The SQL UPDATE statement is one of the core data manipulation tools available to developers and DBAs. It allows you to modify existing data stored in database tables. While inserting new data and querying existing information are common tasks, the ability to update rows is equally important for keeping information current and accurate over time.

In this comprehensive guide, we‘ll dive deep into the UPDATE statement and explore its full range of capabilities. We‘ll start with the basics of updating a single column and progressively work up to advanced techniques like multi-table updates, handling NULLs, cascading updates, and more. Along the way, we‘ll discuss performance considerations, potential pitfalls, and plenty of practical examples in MySQL.

The Anatomy of an UPDATE Query

At its simplest, an UPDATE query targets a single table and specifies a new value for one or more columns. Let‘s break down the basic syntax:

UPDATE table_name
SET column1 = value1, 
    column2 = value2,
    ...
WHERE condition;

The key parts are:

  1. The name of the table you want to update after the UPDATE keyword
  2. A comma-separated list of column assignments after SET, specifying the new value for each column
  3. Optionally, a WHERE clause to filter which rows are updated (more on this later)

So an update query to change a user‘s email might look like:

UPDATE users 
SET email = ‘[email protected]‘
WHERE id = 147;

Updating Multiple Columns

You can set new values for multiple columns in the same query by comma-separating the assignments:

UPDATE products
SET price = 9.99,
    status = ‘SALE‘,
    updated = CURRENT_TIMESTAMP 
WHERE sku = ‘XYZ321‘;

This query updates three columns—price, status, and updated—for a specific product identified by its SKU. Note the updated column is being set to the current date/time using the CURRENT_TIMESTAMP function.

The Power of the WHERE Clause

Leaving off the WHERE clause from an UPDATE query is a recipe for disaster, as it will update every single row in the table! Unless you really intend to change all rows, always include a condition after WHERE to target only specific records.

The WHERE clause works just like it does in a SELECT query. You can use any columns, operators, and functions to craft a condition that identifies the rows you want:

-- Update using equality
UPDATE orders SET status=‘SHIPPED‘ WHERE id=1234;

-- Update using LIKE 
UPDATE products SET active=0 WHERE name LIKE ‘%Clearance%‘;

-- Update using a range
UPDATE users SET type=‘inactive‘ WHERE lastLogin < ‘2023-01-01‘;

These examples respectively update:

  1. A single order by its ID
  2. All products whose name contains "Clearance"
  3. Users who haven‘t logged in since the start of 2023

Using Expressions and Functions

You‘re not limited to directly specifying values in the SET clause. The real power of UPDATE comes from using expressions and functions to derive the new value.

Want to increase all employee salaries by 5%?

UPDATE employees SET salary = salary * 1.05;

How about converting names to proper case?

UPDATE customers 
SET name = CONCAT(UPPER(SUBSTRING(name, 1, 1)), 
                  LOWER(SUBSTRING(name, 2)));

This query uses a combination of UPPER, LOWER, SUBSTRING and CONCAT to capitalize the first letter of each name while making the rest lowercase.

Joining Tables for Complex Updates

Sometimes the information you need to determine the new values comes from another table. No problem – you can join tables in an UPDATE query just like you would in a SELECT.

For example, let‘s say we want to update student grades based on their test scores stored in a separate table:

UPDATE enrollments e
JOIN tests t ON t.student_id = e.student_id AND t.course_id = e.course_id  
SET e.grade = (CASE WHEN t.score >= 90 THEN ‘A‘
                    WHEN t.score >= 80 THEN ‘B‘  
                    WHEN t.score >= 70 THEN ‘C‘
                    WHEN t.score >= 60 THEN ‘D‘
                    ELSE ‘F‘ END)
WHERE t.date = (SELECT MAX(date) FROM tests 
                WHERE student_id = t.student_id AND course_id = t.course_id);

There‘s a lot going on here, so let‘s break it down:

  1. We‘re updating the enrollments table aliased as e
  2. We join to the tests table aliased as t to access test scores
  3. The join condition matches tests to enrollments by student and course
  4. We use a CASE expression in the SET to convert numeric scores to letter grades
  5. The WHERE clause uses a correlated subquery to only consider each student‘s most recent test in each course

While complex, this single query replaces what would otherwise be a cumbersome process of looping through enrollments, looking up scores, and updating grades row by row.

Updating with Subqueries

In the previous example we used a subquery in the WHERE clause, but subqueries can be used in the SET clause too. This is useful when you need to update one table based on values from another table.

Let‘s update the price of each product to the average price of products in its category:

UPDATE products p
SET price = (SELECT AVG(price) FROM products 
             WHERE category = p.category);

For each row, the subquery calculates the average price of products in the same category, which is then used as the new price value.

Performance Considerations

While UPDATE queries are generally fast, there are a few things to keep in mind for optimal performance:

  1. Indexes: Just like with SELECT queries, indexes can dramatically speed up UPDATE queries. If you‘re updating a small number of rows based on an indexed column, the query will be very fast. But if you‘re updating a large portion of the table or using a non-indexed column in your WHERE clause, the query may be slower as MySQL has to scan the entire table.

  2. Row Locking: When an UPDATE query is run, MySQL takes an exclusive lock on each row it modifies. This prevents other queries from modifying those same rows and ensures data consistency. However, if many UPDATE queries are trying to modify the same rows simultaneously, they may block each other leading to slowdowns. In InnoDB, row-level locking is used which is generally very efficient, but in MyISAM, table-level locking is used which can cause contention.

  3. Updating Large Numbers of Rows: If you need to update a very large number of rows (say, millions), it‘s often more efficient to break the update into smaller chunks. You can do this by adding a LIMIT clause to the query and running it multiple times, increasing the offset each time until all rows are updated. This allows other queries to run in between and prevents the update from hogging system resources.

UPDATE large_table SET column1 = value1 WHERE id > 0 AND id <= 1000000; 
UPDATE large_table SET column1 = value1 WHERE id > 1000000 AND id <= 2000000;
-- etc.

Updating Across Multiple Tables

In some cases, you may want to update columns in multiple tables at once. MySQL supports this with a special multi-table UPDATE syntax:

UPDATE table1, table2, ... 
SET table1.column1 = value1, 
    table2.column2 = value2, ...
WHERE condition;

A common use case is updating a fact table and a dimension table in a star schema:

UPDATE sales s
JOIN products p ON s.product_id = p.id
SET s.product_name = p.name,
    s.product_category = p.category
WHERE p.updated > s.updated;  

This query updates the product name and category in the sales fact table whenever the corresponding product dimension has newer information.

Handling NULL Values

NULL values can sometimes throw a wrench in UPDATE queries. Here are a few tips:

  1. To set a column to NULL, use SET column = NULL.

  2. To update a column only if it‘s currently NULL, you can use the IS NULL condition:

UPDATE contacts 
SET email = ‘[email protected]‘
WHERE email IS NULL;
  1. If you‘re setting a column based on an expression that could evaluate to NULL, you can use COALESCE or IFNULL to provide a fallback value:
UPDATE students
SET grade = COALESCE(grade, ‘N/A‘);

This sets grade to ‘N/A‘ if it would otherwise be set to NULL.

ON UPDATE CASCADE

When you have tables related by foreign keys, you often want changes to automatically propagate from the parent table to the child table. That‘s where ON UPDATE CASCADE comes in.

Let‘s say we have a users table and an orders table. Each order is associated with a user via a user_id foreign key. If a user‘s ID changes, we‘d like their associated orders to be updated automatically.

First, we set up the foreign key constraint with ON UPDATE CASCADE:

ALTER TABLE orders
ADD CONSTRAINT fk_user_id
FOREIGN KEY (user_id) REFERENCES users(id)
ON UPDATE CASCADE;

Now, if we update a user‘s ID:

UPDATE users SET id = 1001 WHERE id = 1;

MySQL will automatically cascade this change to the orders table:

SELECT * FROM orders WHERE user_id = 1; 
-- No results

SELECT * FROM orders WHERE user_id = 1001;
-- Shows all orders previously associated with user 1

This ensures referential integrity and saves you from having to manually update the child table.

Pitfalls and Edge Cases

While UPDATE is generally straightforward, there are a few situations that can trip you up:

  1. Updating the Same Row Twice: If your UPDATE query matches a row multiple times (e.g., via a self-join), MySQL will update it multiple times which may not be what you want. To avoid this, you can add a LIMIT 1 clause to the query.

  2. Self-Joins in UPDATE: Speaking of self-joins, they can be tricky in UPDATE queries because you‘re reading from and writing to the same table. Make sure your join conditions are airtight or you may get surprising results.

  3. Subquery Returning Multiple Rows in SET: If you use a subquery in the SET clause, it must return only one row. If it returns multiple rows, you‘ll get an error. You can use an aggregate function like MAX() or MIN() to ensure a single value.

Advanced Examples

To round out our discussion, here are a few more advanced UPDATE techniques:

  1. Updating JSON Columns:

    UPDATE products 
    SET properties = JSON_SET(properties, ‘$.weight‘, 12.3) 
    WHERE sku = ‘ABC123‘;

    This updates the "weight" property of a JSON column using the JSON_SET function.

  2. Updating with Window Functions:

    UPDATE sales s
    JOIN (SELECT product_id, AVG(price) AS avg_price
       FROM sales
       GROUP BY product_id) a ON s.product_id = a.product_id
    SET s.price_diff = s.price - a.avg_price;

    This calculates the difference between each sale‘s price and the average price for that product, using a window function.

  3. Conditional Updates with CASE:

    UPDATE orders
    SET status = CASE
                 WHEN DATEDIFF(CURRENT_DATE(), order_date) > 30 THEN ‘Archived‘
                 WHEN DATEDIFF(CURRENT_DATE(), order_date) > 7 THEN ‘Completed‘
                 ELSE status
              END;

    This updates the status of orders based on how old they are, using a CASE expression.

In Summary

The UPDATE statement is a versatile tool for modifying existing data in your SQL database. Whether you‘re updating a single value, refreshing multiple columns based on complex criteria, cascading changes across related tables, or transforming data with advanced functions, UPDATE can handle the task.

As a data professional, it‘s crucial to understand not just the mechanics of UPDATE, but also the performance implications, potential pitfalls, and best practices surrounding its use. Judicious use of indexes, chunking of large updates, awareness of locking behavior, and careful handling of NULLs and self-joins will help ensure your UPDATE queries are efficient and effective.

Moreover, staying up-to-date with advanced techniques like JSON updating, window functions, and conditional expressions will keep your skills sharp and your queries elegant.

Remember: with great power comes great responsibility. Always consider the safety of your data first. Backup regularly, test carefully, use transactions judiciously, and think twice before running UPDATE without a WHERE clause.

Armed with the knowledge from this guide, you‘re well-equipped to make the most of UPDATE in your SQL arsenal. So go forth and update with confidence!

Similar Posts