How DevOps Met Gaming Needs – Percona Live 2015

I attended the “How DevOps Met Gaming Needs” session by Sean Chighizola and Peter Garbes at Big Fish Games.  They talked about the migration process to a DevOps model.  They went over the history of Big Fish Games which started in 2002 with PC/Mac games.  They use Akimai for their delivery.  They use MySQL to run their site with physicall partitioning by feature/service.  They used a the following technologies, Nagios, cacti, php, perl, subversion, bash scripts, etc.  Their releases were site down deployments with all hands on deck.

The transition was hard to manage.  Change was difficult, there has to be a driver.  The addition of the free to play market was the driver for Big Fish.  They used Fairway Solitaire as an example of a game the made the transition from the digital download to the free to play model.  Even after the migration it would take 20-120 minutes to push out a change.  A change was pushed every day.

Big Fish moved their daily courses to a self-service process.  Was very successful.  But it was not all roses.  They had a lot of non standard changes that did not match their expected workflow.  They saw the DevOps model as a way to solve these residual pains.  They leveraged Jez Humble’s Continuous Delivery book as a guide.  They were able to increase velocity while maintaining team size.  They also were able to decrease failure rate.

They discovered that giving developers operational responsibility will make them think like operators.  They use MEM 3.0 to provide self-service with Query Analyzer being a key driver for that decision.  IT is not in the business of provisioning, we are in the business of engineering processes which put developers in control.  They also went over some MySQL issues they have run into.  I’ll need to grab their slides to get the links to dive in deeper.

They have been moving from Cacti to Graphite.  They have not yet implemented database changes but have a proof of concept in place.  They restrict the types of changes allowed and framework for where to put scripts.

Encrypting MySQL data at Google – Percona Live 2015

I attended the “Encrypting MySQL data at Google” session with Jeremy Cole and Jonas Oreland both from Google.  They started with the why of encrypting data.  The threat they are trying to protect from are:

  • access through network APIs (mysql client …) — not protecting against this
  • Access within from a running server (ptrace, memory dumping, etc.)– not protecting against this
  • Lost or misplaced disks — ARE PROTECTING
  • Backups — ARE PROTECTING

Not all threats are feasible to protect against.  An attacker with unlimited network access and time will be able own you.  And if they can get root they can own you.  You could encrypt data in the columns from the application but a lot of work and you cannot access your data via SQL.  It is also incompatible with 3rd party products.  You can purchase middleware products to provide column encryption.  (MyDiamo and CryptDB were given as examples).  Indexing because an issue with these approaches.  You could just encrypt all the disks but when mounted it would be unencrypted.

In their approach they wanted to encrypt all user data to include temporary tables and InnoDB data and log files.  They wanted to make drive-by data exfiltration impossible.  An attacker will need data and keys to decrypt.  InnoDB organized in 16K pages and they encrypt all pages except for page 0.  They do not encrypt page headers or page trailers.  They also do not use the same key to encrypt multiple similar pieces of data to avoid being able to make guesses on known attributes of InnoDB pages (i.e. lots of zeros in the middle of a page due to initialization).  They encrypt all log blocks but the first 4.  Each block is encrypted with a different key.

They have a Google key management infrastructure so you will need to role your own as theirs is not open source.  Keys are never stored on disk.  Only used in memory.  They did however public a key management plugin interface to allow for someone else to write an open source solution.  They rotate keys on the redo logs by writing dummy records to age out blocks as needed.  Temporary Tables should age themselves out within MySQL.  Binary and relay logs are encrypted by the latest key and similarly age out.  For InnoDB data they keep the key version in the page header and they have background threads that re-encrypt pages as needed to rotate keys.  The number of threads and how much IOPs should be used for rotation are both configurable.

The encryption works with replication as well.  Code available at code.google.com/p/google-mysql.

Key Notes Day 3 – Percona Live 2015

They announced Percona Live Europe in Amsterdam 21- 22nd September.  Call for papers is now open.  They also announced the Percona Live dates for next year as April 18-21 2016.  The conference will be back in Santa Clara.

Rob Young from Percona came up and discussed Percona’s Take on Smart Data Management (What’s Next for Products and Services).  They announced the formation of a customer advisory board to provide feedback to Percona.  They see that we need to evolve or die in the MySQL space.  Data has to perform and must provide flexibility.  It is no longer acceptable for the data to shape us.  We have to shape the data.  The most important thing is how compelling the application is not how maintainable it is.

Today’s data problems do not fit a single solution shaped hole.  The solution requires hard to find talent and skillsets.  Customers have to be the focus, not the technology.  Technology is just a means to an end.  He drew attention to the jbs board and how hard it is to find talent.  An example is craigslist which is using MySQL for active listings and MongoDB for their archives.  No longer is one solution able to effectively handle all data workloads.  Then Rob shared statistics showing the explosive growth of MongoDB.  He then announced the Percona is becoming the first MySQL and MongoDB Data Performance Company.  Their acquisition of TokuDB puts them in a great position to pull this off.  For information on TokuMX see http://www.tokutek.com/tokumx-for-mongodb/.

Rob then went over a high level overview of Percona Server 5.7.  They plan to integrate TokuDB into the next version of Percona XtraDB Cluster.  Then he went over Percona’s commitment to open source solutions for the MySQL and soon the MongoDB platforms.  Rob shared Percona’s plans to leverage Percona Cloud Tools to speed up and improve their consulting and support solutions.

Amrith Kumar from Tesora came up to discuss Building Databases for the “-as-a-Service” World.  I had attended his session on Trove earlier in the week.  And Trove has a lot of promise to improve DBAs scale of influence.  Amrith compared the conflict of design and assembly line workers that build cars with the development and operation groups inside of IT.  If you have two organizations in your organization and you have conflict then combine them.  This leads to DevOps.  But hiring for DevOps is like trying to hire a superhero.  The answer is automation.  Automation improves quality and drives down costs.  This frees up your engineers for higher level issues.  It allows DBAs to work more closely with developers and provide what they need to innovate.

OpenStack and Trove allow you to provide this type of Database automation.  Not only can you use Trove to automate it can also be used to provide self service.  IT also helps automate backups and maintenance.  Trove’s intends to be a full lifecycle automation framework for database services.  Tesora works with experts on different platforms to develop their automations.  Today they support a number of different data platforms both relational and nosql.  Amrith discussed the collaboration between Percona and Tesora to make Percona products easier to automate.  He really shared a compelling presentation on why automation is important and we will look at Trove more closely.

MySQL at Twitter – 2015 – Percona Live 2015

My last session of the day was “MySQL at Twitter – 2015” by 3 engineers at Twitter (Calvin Sun, Inaam Rana, and Anton Kuraev).  They mainly use MySQL for Timeline, Social graphs, users, and other uses for internal operations.  They use Master-Master replication but mostly Master-Slave.  They are running on Twitter WebScale 5.6.23.  They went over a lot of their improvements they have put into MySQL5.5.  Two of interest were auditing and query throttling.

Some of the changes are not practical for our workloads, but they have ported a lot of 5.7 features back into WebScale 5.6.  They showed that in an I/O bound cluster the node upgraded to 5.6 had much fewer page flushes.  They are working on a copy of write to improve incremental backups and provide consistent snapshots.  They also announced MySQL on Mesos named Mysos.  Mesos is a distributed kernel that makes for very large clusters of computers.  This looks pretty cool and is worth a deep dive after I get back from conference.

The Copy on Write has the potential to really help the MySQL backup story.

Advanced MySQL Query Tuning – Percona Live 2015

I attended the “Advanced MySQL Query Tuning” session by Alexander Rubin from Percona.  Alex just finished helping us out with a Galera evaluation and he is just top notch.  He also helps organize the Triangle MySQL Meetup which I attend.  The talk was specific to tuning Queries and did not cover configuration.  He went over GROUP BY, ORDER BY, DISTINCT queries and calculated fields (introduced in 5.7).

Indexes can be used to scan to leaf node on equality search.  It can also be scanned at the leaf level for range searches.  MySQL can use the leftmost part of combined indexes (pt_duplicate_index_checker can be used to look for duplicate indexes).  Alex showed how you can tell in the EXPLAIN if MySQL is only using part of the index based on key_len and the columns in the index.  MySQL cannot use a combined index if you are not using the leftmost part of the index.  A covered index covers all the columns in a query (WHERE and SELECT).  A covering index avoids row lookups.

MySQL cannot use an index if you use a calculated expression.  You will need to remove the expression to use an index.  You can also add a field to the table and create a trigger to keep it up to date and then you can use an index on that columns.  In 5.7.7 they add GATHERED field to create a derived column and they can be indexed.  The insert speed is 2x based on Alex’s example.  And he showed that the select does in fact use the index.  You can also define the derived table to be stored, but it increases the size of your table.

Temporary tables are slow and are created when you have group by, range + order by and in a few other cases.  There are both in memory and on disk temporary tables.  The tmp_table_size and max_heap_table_size configuration options are used to control spillage.  If the table is larger than either it will spill to disk.  Memory temporary table do not support BLOB/TEXT.  If you have a covering combined index it can avoid temporary table creation.

With a range scan the covering index will not help.  You can hack it by converting it to a union of multiple queries with covering indexes.  Still creates a temporary table but much much smaller.

When creating a covering index for filesort you need to start with the where condition and then the group by.

A loose index scan considers only a fraction of the key in an index.  The query cannot have joins, the group by should be the leftmost prefix of the index, only allows MIN() and MAX() aggregate functions.

Joining can cause duplicates and adding the DISTINCT can cause a slowdown.  By moving the DISTINCT to an outer query you can avoid spilling to disk if your select does not contain the columns with BLOG/TEXT.

In 5.7 Group By will no longer automatically cause an ORDER BY.  So Developers will need to add the ORDER BY if that is the desired result.

Big Transactions on Galera Cluster – Percona Live 2015

I attended the session “Big Transactions on Galera Cluster” Seppo Jaakola from Codership (folks that write Galera).

Huge Transactions are:

  • Large read set – no problems
  • Large Write Set roughly >10K rows modified can be slow
  • Long term transaction – ok but vulnerable for multi-master conflicts

A large transaction can cause flow control to kick in and prevent any transaction to commit before it finishes.  This is due to strict ordering of transactions to keep certification working deterministically.  Long term transactions are vulnerable to multi master writes.  Galera 3 and earlier can use wsrep_max_ws_rows and _size to limit the size of transactions.  2GB is upper limit for transaction size.

If you have a huge transaction be careful killing it because of rollback.  You may want to adjust flow control to avoid the other nodes from freezing.  Seppo did a demo showing the impact of huge transactions on the cluster.  Solutions could be to skip flow control using wsrep_desync.  This will fix the master, but slaves will just lag further and further behind.  You can also relax the commit order and allow the nodes to be temporarily out of sync (not supported by Galera just an idea).

Streaming replication helps and is added in Galera 4.  wsrep_trx_fragment_unit and wsrep_trx_fragement_size are used to configure.  This basically divides a huge transaction into a series of smaller transactions that will not interfere with Galera flow control.  Seppo set up streaming replication and then reran his demo showing the improvement of the cluster performance.  This will allow Galera to break the 2 GB write set limit by breaking up large transactions into smaller chunks.  These changes also help Galera to handle hot-spot transactions by setting transaction fragment size to 1 statement.

There are still issues with certification is large and small transactions are being written to multiple nodes, but this appears to be a huge improvement for Galera.  I look forward to playing with it in the future.  Galera 4 should be out first half of this year. Additional features Non blocking DDL. streaming replication, and optimized consistency alert shutdown (protocol to avoid cluster crash).

Managing MySQL with Puppet – Percona Live 2015

I attended the “Managing MySQL with Puppet” session at Percona Live 2015 with Jaako Pesonen from Spil Games.  He is using Diffy, Sheldon (provides information about environment, host, etc to Puppet), OpenStack, MySQL, Percona XtraDB Cluster, and Puppet.  In this session he concentrated on Puppet and Percona XtraDB Cluster.  Puppetlabs mysql is provided by Puppet labs.  It used to not be that good, but is getting better.  Puppet contains 4 mysql providors (server, client, bindings,and backups).  Here are my notes:

  • He recommends using Hiera to make puppet more useful and readable
  • https://github.com/spilgames/spilgames-mysql to get access to their module you can modify as needed (not set up for general use)
  • Nothing special needed for puppet and MHA or regular replication
  • Galera is special as that the first node is different
  • They use different users between production and staging for backups to avoid having production credentials in staging

There was a lot of puppet configuration slides that I will need to dive into more.  My takeaways is that we can manage Galera with Puppet, but it will take some customization.  I look forward to diving into Jaako’s GitHub repository and trying to figure out what will work best for us.

Better DevOps with MySQL and Docker – Percona Live 2015

So I ran into the Docker session “Better DevOps with MySQL and Docker” by Sunny Gleason from SunnyCloud late.  Docker can help you lay down images of your system but it is not the be all end all for DevOps.  Docker is a replacement for your Hypervisor.  Docker instead is a way to isolate workloads on a single Operating system.  You can run docker on top of a Hypervisor but it is not required.

Need to be careful with public images.  Hard to tell the good from the bad.  Docker working on a process to improve this.  Some of the continuing issues are managing the connectivity between containers and the physical environment.  Cross platform is also very new.  Based on Linux containers so security is still an issue.  Not really for multi tenant but for running multiple trusted solutions on the same hardware.  Deployment can be I/O intensive.

InnoDB Change Buffering – Percona Live 2015

My first real session of the day today at Percona Live was “InnoDB Change Buffering” with Davi Arnaut from LinkedIn.

The change buffer is used to batch changes to non-unique secondary indexes to avoid random I/O. This is a feature that has to be enabled and you can use innodb_change_buffering to

  • inserts
  • deleted
  • purges
  • changes (inserts and delete-mark)
  • all (default)
  • none

If the page is not in the buffer pool then the changes is buffered and it marks the buffer bitmap that the page has pending changes and then the changes are merged on read from disk.  The change buffer is actually just a separate index not exposed with page number as the index.  It is also smart enough to know that a change  will cause a page to be empty or to split and it will not buffer those changes.

Need to grab these slides and go over it as to keep up and take notes was not working very well.

Key Notes Day 2 – Percona Live 2015

After the standard house cleaning items, Steaphen Greene from Facebook came out to discuss WebScaleSQL.  WebScaleSQL is a collaboration project between a number of companies supporting large Web applications on MySQL (Alibaba, FaceBook, Google, LinkedIn, and Twitter).  This is a branch of MySQL modified by all the companies.  They also work with MariaDB, Oracle, and Percona on their development.  He gave a lot of praise to Oracle for their work on MySQL.  Some interesting features are:

  • Asynchronous MySQL Client
  • Lower InnoDB buffer pool memory footprint
  • Reduced double-write mode
  • Server-side query management features
    • Query throttling etc.
  • Logical read-ahead (faster backups)

Coming features

  • On-line InnoDB rebuilds
  • DocStore
  • MySQL and RocksDB

He then wrapped up with a discussion on the legal challenges of trying to get WebScale changes into the main MySQL code branch.

Then Patrik Sallner from MariaDB came up and gave a talk on how open source can help solve data management problems.  His slide on the data storage subway map was very interesting and worth trying to find.  He points out that open source provides for greater collaboration between companies and reduce the duplication of efforts.  He showed how MariaDB has included changes from a large list of companies into their code base.  One feature discussed of interest was the Connect Storage Engine which allows federation of data from other data platforms.  MariaDB 10.1 will have Galera in the base code so no separate binaries, activation is based on configuration.  10.1 will also have tablespace and table level encryption built in. My take away was that I need to look into Webyog and Sea Lion for monitoring.

After Patrik, Tomas Ulin from Oracle came out to celebrate the 20th anniversary of MySQL.  It is also been 10 years that Oracle has owned InnoDB and 5 owning MySQL.  5.7 Release candidate was announced last week.  Highlights of 5.7 are:

  • 2 x faster then 5.6
  • Enhanced InnoDB: faster online and bulk load operations
  • Replication Improvements: multi master multi threaded slaves (8-10x faster slaves)
  • New optimizer cost model: greater user control and better query performance
  • performance schema improvements
  • Query Rewrite Plugin
  • JSON support added (labs now but planned for GA)
  • Functional indexes (wasn’t clear if this was new)
  • SYS Schema improvements to help with monitoring and administration
  • GIS improvements
  • InnoDB improvements
    • Native partitioning
    • configurable fill factor
    • Buffer Pool resize online
    • Separate UNDO tablespace
  • Group Replication (probably not available for GA see labs.mysql.com)

Then he talked about MySQL Fabric 1.5.  The product sounds promising and we will take a look at this.  Also, Workbench 6.3 will have support for Fabric and many of these other features (just went RC).  Call for papers ends in two weeks for Open World.

Last but not least Robert Hodges from VMWare came out to talk about the cloud.  He talked about how VMWare is looking to build a public cloud that better integrates with private clouds that Enterprises already have.  The talk was basically a overview of the features of the newest version of VMWare.