OpenWorld 2017: MySQL Automatic Diagnostics: System, Mechanism, and Usage

In this session, Shangshun Lei and Lixun Peng from Alibaba Cloud discussed CloudDBA, a system they have built to automate many traditional DBA tasks.  The system sounds pretty cool and is what my team should be aspiring to build.  However, the session covered a lot of what CloudDBA does and not much of how it does it.  My notes from the session are:

  • Why CloudDBA
    • Reduce Costs –
      • 80% of time spent on finding root cause, optimizing performance, scaling hardware and resources
      • 20% on database platform
    • Focus your resources on business
    • Provide best technology
  • Architecture
    • Kafka/JStorm for log collection
    • Offline Data repository
      • Error log
      • slow log
      • audit log
      • CPU/IOs/status
    • Offline diagnostics
      • Top SQL Analysis
      • Trx Analysis
      • SQL Review
      • Deadlock Analysis
    • Online diagnostics
      • Knowledge base – rule engine
      • Inference Engine – matches conditions and then either runs an action to resolve the issue or provides advice to users
    • Real-time events and advice
      • Slave delay
      • Config Tuning
      • Active Session
      • Lock and Transaction
      • Resource
  • Rule Engine
    • Immediate detection of useful changes with low cost
    • Choose correct inference model
      • Database global status is mature and easy to get
      • High frequency monitoring to make sure no useful info is missed
      • Real time state change detection algorithms
      • Importance of database experience
  • Knowledge Base and inference engine
    • Ability to accumulate DBA experts’ experience in short time
    • Accurate issue detection & corresponding advice
  • Offline diagnosis
    • Audit log does matter
    • Record full SQLs for database
    • A feature of AliSQL, no performance impact
  • Transaction analysis
    • uncommitted transactions
    • long transactions
    • long intervals between statements in a transaction
    • big transactions
  • SQL review
    • how many types of SQL
    • how many types of transactions
    • whether the SQL statements, or their sequence within a transaction, are as expected
    • rows scanned, rows returned, elapsed time, and SQL advice
  • Top SQL
    • top SQL before optimization
    • helps answer questions such as why my CPU is at 100%
    • different statistics dimensions and performance metrics
  • SQL Advisor
    • Not a database optimizer
    • built outside of MySQL kernel
    • query rewriter
    • follows rules to create indexes that work at the lowest cost
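
Since the session was light on the how, here is a minimal Python sketch of what the four transaction analysis checks from my notes (uncommitted, long, long interval between statements, big) might look like when run over audit log records. This is my own illustration, not CloudDBA's implementation: the `Statement` and `Transaction` record shapes, the `classify` function, and all three thresholds are hypothetical, and "big" is assumed here to mean total rows affected.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Statement:
    ts: float               # timestamp in seconds (hypothetical audit log field)
    sql: str                # full SQL text as recorded in the audit log
    rows_affected: int = 0  # rows changed by this statement


@dataclass
class Transaction:
    trx_id: str
    statements: List[Statement]
    committed: bool


# Hypothetical thresholds; a real system would tune these per workload.
LONG_TRX_SECONDS = 10.0
LONG_GAP_SECONDS = 5.0
BIG_TRX_ROWS = 10_000


def classify(trx: Transaction) -> List[str]:
    """Return the transaction analysis flags that apply to one transaction."""
    flags: List[str] = []
    if not trx.statements:
        return flags

    if not trx.committed:
        flags.append("uncommitted")

    # Sort timestamps so out-of-order log records do not skew the checks.
    times = sorted(s.ts for s in trx.statements)

    # Long transaction: time from first to last statement exceeds the limit.
    if times[-1] - times[0] > LONG_TRX_SECONDS:
        flags.append("long")

    # Long interval: any idle gap between consecutive statements is too large.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if gaps and max(gaps) > LONG_GAP_SECONDS:
        flags.append("long_interval")

    # Big transaction: total rows affected exceeds the limit (an assumption).
    if sum(s.rows_affected for s in trx.statements) > BIG_TRX_ROWS:
        flags.append("big")

    return flags


if __name__ == "__main__":
    trx = Transaction(
        trx_id="t1",
        statements=[
            Statement(0.0, "BEGIN"),
            Statement(1.0, "UPDATE orders SET status = 'paid'", 20_000),
            Statement(12.0, "UPDATE orders SET status = 'shipped' WHERE id = 1", 5),
        ],
        committed=False,
    )
    print(classify(trx))
```

The example transaction trips all four checks: it never commits, spans 12 seconds, sits idle for 11 seconds between two statements, and touches more than 10,000 rows. In a setup like the one described in the talk, records of this kind would presumably come from the offline audit log repository rather than being built by hand.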
