OpenWorld 2017: MySQL Automatic Diagnostics: System, Mechanism, and Usage

In this session, Shangshun Lei and Lixun Peng from Alibaba Cloud discussed a system they have built called CloudDBA that automates a lot of traditional DBA roles.  The system sounds pretty cool and is what my team should be aspiring to.  There was a lot on the what of CloudDBA, but not much on the how.  My notes from the session are:

  • Why CloudDBA
    • Reduce costs
      • 80% of time spent on finding root cause, optimizing performance, scaling hardware and resources
      • 20% on database platform
    • Focus your resources on business
    • Provide best technology
  • Architecture
    • Kafka/JStorm for log collection
    • Offline Data repository
      • Error log
      • slow log
      • audit log
      • cpu/ios/status
    • Offline diagnostics
      • Top SQL Analysis
      • Trx Analysis
      • SQL Review
      • Deadlock Analysis
    • Online diagnostics
      • Knowledge base – rule engine
      • Inference Engine – matches conditions and runs actions to resolve issues or provide advice to users
    • Realtime events and advice
      • Slave delay
      • Config Tuning
      • Active Session
      • Lock and Transaction
      • Resource
  • Rule Engine
    • Immediate detection of useful changes with low cost
    • Choose correct inference model
      • Database global status is mature and easy to get
      • High frequency monitoring to make sure no useful info is missed
      • Real time state change detection algorithms
      • Importance of database experience
  • Knowledge Base and inference engine
    • Ability to accumulate DBA experts’ experience in short time
    • Accurate issue detection & corresponding advice
  • Offline diagnosis
    • Audit log does matter
    • Record full SQLs for database
    • A feature of AliSQL, no performance impact
  • Transaction analysis
    • uncommitted transactions
    • long transactions
    • long intervals between statements in a transaction
    • big transactions
  • SQL review
    • how many types of SQL
    • how many types of transactions
    • whether the SQLs, or their sequence within a transaction, are expected or not
    • scan rows, return rows, elapsed time, and SQL advice
  • Top SQL
    • top SQL before optimizing
    • helps answer questions such as “why is my CPU at 100%?”
    • different statistics dimensions and performance metrics
  • SQL Advisor
    • Not a database optimizer
    • built outside of MySQL kernel
    • query rewriter
    • follows rules to create indexes that work at the lowest cost
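The transaction-analysis checks above can be approximated on stock MySQL with the standard InnoDB transaction view; a minimal sketch (the 60-second threshold is my own, not CloudDBA's):

```sql
-- Flag transactions open longer than 60 seconds; uncommitted,
-- long-running, and idle-in-transaction sessions all show up here.
SELECT trx_id,
       trx_mysql_thread_id AS connection_id,
       trx_started,
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS open_seconds,
       trx_rows_modified,
       trx_query
FROM   information_schema.innodb_trx
WHERE  trx_started < NOW() - INTERVAL 60 SECOND
ORDER  BY trx_started;
```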
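As a toy illustration of the rule-based index advice described under SQL Advisor (this example and the table are mine, not from the talk): given a query that filters on equality predicates and sorts on another column, a simple rule is equality columns first, then the ORDER BY column:

```sql
-- Query pattern pulled from the audit log (hypothetical):
--   SELECT id, total FROM orders
--   WHERE  customer_id = 42 AND status = 'open'
--   ORDER  BY created_at;

-- Rule: equality columns first, then the sort column, so one
-- index satisfies both the WHERE and the ORDER BY.
ALTER TABLE orders
  ADD INDEX idx_cust_status_created (customer_id, status, created_at);
```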

Performance Birds of a Feather – Percona Live 2015

This session was a talk with Rick James (http://mysql.rjweb.org) where we just talked about performance issues the audience was interested in.  My notes are:

  • Start with InnoDB buffer pool at 70% of memory unless really small VM
  • Partitioning should not be done for performance
  • Stored procedures can provide better performance than views
  • If you are replicating then break up your deletes into small batches
  • We talked about ways to create shards to minimize impact
  • Slow Query log at 2 seconds or less and use pt-query-digest to analyze
  • Batch your inserts to have one insert statement with up to 1,000 rows to add
  • For Galera you can make an argument for running with innodb_flush_log_at_trx_commit set to 0 or 2, because a node can be rebuilt with SST
  • You can use INFORMATION_SCHEMA tables to build a blocking chain
  • Subqueries can perform poorly because the optimizer is not smart enough to handle them
  • Try your best to turn them into a join, unless you have a GROUP BY in the subquery to limit the rows
  • You can also create a temp table in memory and join to it instead of using a subquery
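A minimal sketch of that subquery-to-join rewrite (table and column names are made up):

```sql
-- Often slow on older MySQL versions: dependent IN subquery.
SELECT o.id, o.total
FROM   orders o
WHERE  o.customer_id IN (SELECT c.id
                         FROM   customers c
                         WHERE  c.region = 'EMEA');

-- Equivalent join, which the optimizer handles much better:
SELECT o.id, o.total
FROM   orders o
JOIN   customers c ON c.id = o.customer_id
WHERE  c.region = 'EMEA';
```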

I need to grab Rick’s information on Galera for developers.

Improving Performance with Better Indexes – Percona Live 2015

In the session titled “Improving Performance with Better Indexes,” Ronald Bradford from EffectiveMySQL went over a test case he had with a customer to improve the performance of an already indexed system.  Basically we learned how to use the Using index information in the explain plan.  He has a 6 step process:

  • Capture
  • Identify
  • Confirm
  • Analyze
  • Optimize
  • Verify

Performance Schema provides 3 tables, events_statements_…, that you can use to identify queries.  Ronald also went over several other ways to log SQL statements.  Then he discussed how to identify the queries to look at.  The slowest query is not the best place to start.  Instead you need to look at the total duration for all queries with the same signature in a given time window.  This is the same issue, and a common mistake made by DBAs, on all platforms.  He then went over the following commands:

  • EXPLAIN – The explain plan
  • SHOW CREATE TABLE – shows DDL to create a table
  • SHOW INDEXES FROM – shows statistics (cardinality only, no histograms)
  • INFORMATION_SCHEMA.TABLES – shows size of data on disk
  • SHOW TABLE STATUS LIKE – also shows size of table on disk
  • EXPLAIN EXTENDED – adds more information to explain plan
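Putting the identify/confirm/analyze commands together on a hypothetical orders table (the digest query is how I would do the “total duration per signature” step with Performance Schema; its timer columns are in picoseconds):

```sql
-- Identify: rank query signatures by total time, not by single slowest run.
SELECT digest_text,
       count_star                      AS executions,
       ROUND(sum_timer_wait / 1e12, 2) AS total_seconds
FROM   performance_schema.events_statements_summary_by_digest
ORDER  BY sum_timer_wait DESC
LIMIT  10;

-- Confirm: schema, indexes, and size on disk.
SHOW CREATE TABLE orders;
SHOW INDEXES FROM orders;
SELECT table_name, data_length, index_length
FROM   information_schema.tables
WHERE  table_schema = DATABASE() AND table_name = 'orders';
SHOW TABLE STATUS LIKE 'orders';

-- Analyze: EXPLAIN EXTENDED adds a filtered column, and SHOW WARNINGS
-- then displays the optimizer's rewritten form of the query.
EXPLAIN EXTENDED SELECT id FROM orders WHERE customer_id = 42;
SHOW WARNINGS;
```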

He talked very briefly about optimizer trace: https://dev.mysql.com/doc/internals/en/optimizer-tracing.html.

When looking at EXPLAIN output, key: NULL means the query is not using an index.  rows: tells you the estimated number of rows examined.  When Using index appears in the Extra column, it means we are using a covering index and do not have to go to the table to get data from the rows.  key_len tells us the number of bytes being used for comparison, so the bigger the better in query tuning (more of the index is being used).  Covering indexes can be an issue when large columns are involved: if the index is too big, you will add I/O to use it; if it is small enough, it will fit entirely in memory.
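A minimal example of that covering-index behavior (table and index names are mine):

```sql
-- Both selected columns live in the index, so InnoDB never has
-- to touch the table rows for this query:
ALTER TABLE orders ADD INDEX idx_cust_total (customer_id, total);

-- EXPLAIN now shows key: idx_cust_total and Extra: Using index.
EXPLAIN SELECT customer_id, total
FROM    orders
WHERE   customer_id = 42;
```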

This session had some great information.  I will probably take it back and combine it with my SQL Server 12 step query tuning session to make a MySQL tuning 12 step presentation to give to our developers.