A short guide to MySQL database optimization

MySQL is a very popular open source database, but many install it and forget about it. Spending a little time on MySQL database optimization can reap huge returns …

In this article, I want to show you a couple of the first places you should head, when you need to pinpoint bottlenecks or tweak the MySQL configuration.

MySQL slow log

The slow log, will log any queries that take longer than a given number of seconds.

This can help you to identify poorly written or demanding queries.  You can then refactor them or use concepts like “indexes” to speed them up.

It’s often helpful to start with a high “long query time”, to just flag up the longest queries and then gradually reduce it, as you deal with each one in turn.

To enable the slow log, create the following /etc/mysql/mysql.conf.d/mycustomconfigs.cnf or add the following lines to your my.conf file…

[mysqld]
slow_query_log=true
long_query_time=1
slow-query-log-file=/var/log/mysql/mysql-slow

… then restart MySQL, to load in the new values.

Improve the query

Once you’ve found a slow query, it’s worth considering if there is a simpler way to get the same information.

If you can improve the performance of the query you might be able to skip looking into why the old one was slow.

Explain/Describe

If you’re still looking to improve your query, we need to dig into how MySQL is actually running the query and why it’s slow.  This will give us a better idea how to fix it.

For these examples I ran a few basic queries against this auto generated employee data.

Let’s suppose your slow query is:

SELECT AVG(hire_date) FROM employees WHERE emp_no IN (SELECT emp_no FROM dept_manager)

To see how it will be executed, prefix it with EXPLAIN:

mysql> EXPLAIN SELECT AVG(hire_date) FROM employees WHERE emp_no IN (SELECT emp_no FROM dept_manager);
+----+-------------+--------------+------------+--------+---------------+---------+---------+-------------------------------+------+----------+------------------------+
| id | select_type | table        | partitions | type   | possible_keys | key     | key_len | ref                           | rows | filtered | Extra                  |
+----+-------------+--------------+------------+--------+---------------+---------+---------+-------------------------------+------+----------+------------------------+
| 1  | SIMPLE      | dept_manager | NULL       | index  | PRIMARY       | PRIMARY | 8       | NULL                          | 24   | 100.00   | Using index; LooseScan |
| 1  | SIMPLE      | employees    | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | employees.dept_manager.emp_no | 1    | 100.00   | NULL                   |
+----+-------------+--------------+------------+--------+---------------+---------+---------+-------------------------------+------+----------+------------------------+
2 rows in set, 1 warning (0.00 sec)

It’s worth having a quick look at the documentation on the output format.  Three of the columns to check are:

  • key – is the index that will be used, the one you’d expect?
  • rows – is the number of rows to be examined as low as possible?
  • filtered – is the percentage of results being filtered as high as possible?

Suppose we’re regularly making the following query:

SELECT * FROM employees WHERE gender = 'M'

This has type: ALL meaning that all rows in the table will be scanned.

It therefore makes sense here to add an index.

After doing so, the type changes to ref – MySQL can simply return the rows matching the index rather than checking every row.

As you’d expect this halves the number of rows:

mysql> EXPLAIN SELECT * FROM employees WHERE gender = 'M';
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table     | partitions | type | possible_keys | key  | key_len | ref  | rows  | filtered  | Extra       |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| 1  | SIMPLE      | employees | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 299025 | 50.00    | Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> CREATE INDEX gender ON employees(gender);
Query OK, 0 rows affected (0.97 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> EXPLAIN SELECT * FROM employees WHERE gender = 'M';
+----+-------------+-----------+------------+------+---------------+--------+---------+-------+--------+----------+-------+
| id | select_type | table     | partitions | type | possible_keys | key    | key_len | ref   | rows   | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+--------+---------+-------+--------+----------+-------+
| 1  | SIMPLE      | employees | NULL       | ref  | gender        | gender | 1       | const | 149512 | 100.00   | NULL  |
+----+-------------+-----------+------------+------+---------------+--------+---------+-------+--------+----------+-------+
1 row in set, 1 warning (0.01 sec)

Third Party Tools

I should mention, that there are a bunch of tools that can help you find bottlenecks and really hone your MySQL database optimization techniques.

For websites written in PHP, we’re big fans of New Relic APM. This tool will allow you to sort pages based on their load time.  You can then dig deeper into whether the application code or database queries have the most room for improvement.

Once you’ve narrowed things down, you can start implementing improvements.

It’s worth having a search for other application monitoring providers to see if tools such as Datadog or DynaTrace better suit you.

MySQL Tuner

MySQL Tuner is a tool which looks at a database’s usage patterns to suggest configuration improvements.  Make sure you run it on a live system, that has been running for at least 24 hours.  Otherwise it won’t have access to enough data to make relevant recommendations.

Before you download and use the tool I’ll echo it’s warning:

It is extremely important for you to fully understand each change you make to a MySQL database server. If you don’t understand portions of the script’s output, or if you don’t understand the recommendations, you should consult a knowledgeable DBA or system administrator that you trust. Always test your changes on staging environments, and always keep in mind that improvements in one area can negatively affect MySQL in other areas.

Once you run MySQL Tuner, it will login to the database, read a bunch of metrics and print out some recommendations.

It’s worth grouping the related ones and reading up on any you haven’t come across.  After that you can test out changes in your staging environment.

One common improvement is to set skip-name-resolve.  This saves a little bit of time on each connection by not performing DNS lookups.  Before you do this make sure you aren’t using DNS in any of your grant statements (you’re just using IP addresses or localhost).

Your friendly SysAdmins

Of course, we are also here to help and regularly advise customers on changes that can be made to their infrastructure.

Give us a shout if you think we can help you too.

 

Feature image by Joris Leermakers licensed CC BY-SA 2.0.

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published.