Percona Live Data Performance Conference 2016

I’m sitting in the San Francisco airport trying to get enough caffeine in me to feel alive again.  A sure sign of a great conference.  Last year I live blogged the conference, but this year there was so much stuff I wanted to absorb that I failed to make time for that.  There were a ton of great sessions.  It did feel like there were fewer folks in attendance, but that only made the learning feel more personal.  This year the folks at Percona opened up the conference to encourage content across all platforms.  I wonder if that made some of the MySQL folks worry that the conference didn’t have anything to offer them.  I can tell you that there was plenty of great MySQL content.

Working in an enterprise database administration shop, the expanded content was perfect for me.  While we have historically only supported 2 platforms, we currently support 4 (about to be 5) and are constantly being pushed to support more.  This is a natural extension of modern application design, and having a conference where we can hit on several platforms really helps.  We all have tight budgets these days, and it’s almost like having 3-4 conferences in one.  But the soul of the conference is still very much MySQL.  With content geared toward first-time DBAs through sessions on InnoDB and the new MySQL Document Store, you can’t go wrong with this conference to expand your MySQL knowledge.

I am coming home with a bunch of great ideas.  The hard part now is picking which things to work on implementing first.  Some will be changes my team can make ourselves, and others will need to involve changes in other teams as well.  I was very intrigued by the “chatops” idea.  I will try to see how we can integrate the concept into our operations.  I was similarly interested in the rise of integrating chat bots into database automation.  We are hitting database automation really hard right now.  Thus far our focus has been on self-service APIs, but I think we will add a natural text interface on top of these APIs to further improve the experience for our customers.

Well, I am out of coffee so I need to run and refill the cup.  More to come on Percona Live and all the projects that come out of it.

It’s Been Awhile

Well, life happens, and like everyone else I’ve been busy.  I have, however, been working on some exciting stuff, so I thought it was time to get back to blogging to share all the cool stuff going on.  First, I have been working in management.  This means that I have to steal cycles to do the techie stuff these days.  Second, my son has started playing travel soccer, which steals most of my early evenings and a good part of my weekends.  I do love to see him compete, however, so no complaints here.  All this is to say that I have had to be more deliberate in what I have been working on.

I plan for this to be the first in a series of blogs, and I thought it best to go over some of the topics that have been taking up my time.  I will list out both technical and non-technical issues that I have been wrestling with and provide a little context.  My hope is this information will help me prioritize my subsequent blog posts as well as give some context to them when they come out.

MySQL

Last year, my team stepped up to the plate and went from zero to fully supporting Highly Available MySQL in about 9 months.  We spent time learning the platform, sharing knowledge, and testing different configurations.  We also worked with Percona to have an independent expert come out and verify that we were on the right track.  When it was all said and done, we not only had a handful of production applications on stand-alone MySQL instances and an application running on top of Galera, we also realized that MySQL gives us much better control over the data platform than our primary platforms (Oracle and Microsoft SQL Server).  My technical contribution to this project was providing a one-day hands-on Galera training for the team so they could better understand that technology.

MongoDB

After spending last year rolling out MySQL, I had planned to spend this year cleaning up some remaining issues and working on further standardizing our Oracle and SQL Server environments.  Life has a way of changing my priorities.  Our company had a major initiative that required the use of MongoDB as the data store.  I had been reading up on MongoDB and had attended some local user group meetings, but as a team we were basically at ground zero as far as MongoDB was concerned.  Once again, we leaned on Percona to help us get some replica sets up in support of the aggressive project goals, and I had two of my folks work through the wonderful online training provided by MongoDB Inc.

We are still very much in the middle of this project, but so far I am excited by what I have seen.  I think that developers here are going to love some things about MongoDB, and it will help us solve some issues we have in being more agile with our data stores.

Database as a Service

Another big initiative my team has going on right now is transitioning to a true Database as a Service provider.  We currently have a self-help website I built years ago that lets our customers spin up development and test databases, copy databases between instances, as well as deploy code without worrying about the documentation needed in the CMDB (the application handles all that).  It has served us well, but it was built without an API.  In order to integrate with other automation initiatives within the company we need to provide all this and more via an API.

We looked at Cloud Foundry and Trove as options to “buy” our way out of this.  But with all the other change going on, we decided that it would be best to implement our own API and allow others to plug in as needed.  This allows us to better control the back-end services and keep our existing investments in monitoring, backups, and other processes.  I am working with another member of my team to build a RESTful API for this.  We have chosen to leverage Node.js as our development platform.  For a “.NET” guy this is a steep learning curve, but so far I am digging it.

I’ll try to keep you up to speed on these areas.  I apologize in advance if posts come at you in a seemingly random way from here on out.  I just hope to make the time to share, and with all this going on the topics are bound to blend together.

 

More MySQL Auto Increment Fun

Originally Posted 10/21/2014

Last blog post I started to explore some of what I found to be the interesting behaviors surrounding the AUTO_INCREMENT attribute in MySQL.  With strict mode not set, it does some really strange things, like allowing you to add the max value in the range over and over again as long as you delete the record that used to hold that value.  This could cause some pretty interesting issues in applications that are not storing their foreign key relationships in the database.  But I wanted to see how strict mode would impact the behavior, so I rebuilt the environment using the following script file.

CREATE DATABASE IF NOT EXISTS moderndba;

USE moderndba;

DROP TABLE IF EXISTS autofun;

CREATE TABLE IF NOT EXISTS autofun
(
      employee_id        TINYINT SIGNED NOT NULL AUTO_INCREMENT
    , employee_name      VARCHAR(150) NOT NULL
    , employee_level     TINYINT UNSIGNED NULL
    , PRIMARY KEY (employee_id)
);

INSERT INTO autofun (employee_name, employee_level)
VALUES
  ('Joe Smith', 0)
, ('Tim Bradford', 1)
, ('Betty Stafford', 1);

Notice, I changed the data type for the employee_id field from an unsigned int to a signed tiny int. This was just to make the math a little easier for me, as this brings the max value in the range down to an easy-to-remember (and type) 127. Last time we saw that if you insert a row and provide a value for employee_id at the end of the range, you cannot insert another row without specifying a value for employee_id that is not already used in the table. If you insert a row and do not provide an employee_id, or provide 0 or NULL, the row will not insert. But the error is not that the value is out of range. Instead, the error is that there is a conflict in the primary key. This is because MySQL notices that the next value in the sequence is outside the range, picks the closest value within the range, and tries to insert that. But I know that STRICT_ALL_TABLES mode makes MySQL less forgiving with range values, as well as in other ways. So, I ran the following to see what would happen with strict mode turned on.

mysql> USE moderndba;
 Reading table information for completion of table and column names
 You can turn off this feature to get a quicker startup with -A

 Database changed
 mysql> SELECT *
     -> FROM autofun;
 +-------------+----------------+----------------+
 | employee_id | employee_name  | employee_level |
 +-------------+----------------+----------------+
 |           1 | Joe Smith      |              0 |
 |           2 | Tim Bradford   |              1 |
 |           3 | Betty Stafford |              1 |
 +-------------+----------------+----------------+
 3 rows in set (0.00 sec)

 mysql> SET sql_mode='STRICT_ALL_TABLES';
 Query OK, 0 rows affected (0.04 sec)

 mysql>  INSERT INTO autofun
     -> (employee_id, employee_name)
     -> VALUES
     -> (127, 'John Wiley');
 Query OK, 1 row affected (0.03 sec)

 mysql> SELECT *
     -> FROM autofun;
 +-------------+----------------+----------------+
 | employee_id | employee_name  | employee_level |
 +-------------+----------------+----------------+
 |           1 | Joe Smith      |              0 |
 |           2 | Tim Bradford   |              1 |
 |           3 | Betty Stafford |              1 |
 |         127 | John Wiley     |           NULL |
 +-------------+----------------+----------------+
 4 rows in set (0.00 sec)

 mysql> INSERT INTO autofun (employee_name)
     -> VALUES ('Amanda King');
 ERROR 1062 (23000): Duplicate entry '127' for key 'PRIMARY'

Oh well. So much for strict mode helping us get a better error message. It is still only choking because of the duplicate key. I was hoping we would get a message that the next value was out of range, but no such luck.

Last time we saw that if you delete the last row and then perform an insert without specifying the employee_id, you will get the max value again. But while reading up more on the AUTO_INCREMENT attribute, I found this documentation on how it works on an InnoDB table. I thought this was interesting. So to test it out, I ran the following statement to remove the last row in the table.

mysql> DELETE FROM autofun WHERE employee_id = 127;
 Query OK, 1 row affected (0.09 sec)

 mysql> SELECT *
     -> FROM autofun;
 +-------------+----------------+----------------+
 | employee_id | employee_name  | employee_level |
 +-------------+----------------+----------------+
 |           1 | Joe Smith      |              0 |
 |           2 | Tim Bradford   |              1 |
 |           3 | Betty Stafford |              1 |
 +-------------+----------------+----------------+
 3 rows in set (0.00 sec)

So last time when we performed the insert, we would get an employee_id of 127. But what happens if the server bounces between the delete and the subsequent insert? To find out, I exited the MySQL prompt and ran the following from the shell prompt.

[student@percona56 ~]$ sudo service mysql restart

Then after the restart I ran the following.

mysql> INSERT INTO autofun (employee_name)
     -> VALUES ('Amanda King');
 Query OK, 1 row affected (0.03 sec)

 mysql> SELECT *
     -> FROM autofun;
 +-------------+----------------+----------------+
 | employee_id | employee_name  | employee_level |
 +-------------+----------------+----------------+
 |           1 | Joe Smith      |              0 |
 |           2 | Tim Bradford   |              1 |
 |           3 | Betty Stafford |              1 |
 |           4 | Amanda King    |           NULL |
 +-------------+----------------+----------------+
 4 rows in set (0.00 sec)

So if the server is restarted, the next value in the sequence is recalculated based on the maximum value stored in the column. This could be either beneficial (if you need to resolve a situation where you have removed some values you want to reuse) or not so beneficial (if you are not managing your foreign key relationships carefully). I still have a lot more questions about the AUTO_INCREMENT attribute in MySQL, but I think this is enough for the moment. I hope you are finding this behavior just as interesting as I am.
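As a side note, if you want to see where the counter stands without bouncing the server, you can ask the data dictionary directly, or pin the counter yourself. A quick sketch against the same moderndba.autofun table from above (note that on newer MySQL versions the information_schema value can be cached, so SHOW TABLE STATUS is another option):

```sql
-- Inspect the next value MySQL plans to hand out for autofun.
SELECT AUTO_INCREMENT
FROM   information_schema.TABLES
WHERE  TABLE_SCHEMA = 'moderndba'
  AND  TABLE_NAME   = 'autofun';

-- Or reset the counter explicitly instead of relying on a restart.
-- MySQL silently raises this to MAX(employee_id) + 1 if you pick a
-- value at or below what is already stored in the column.
ALTER TABLE autofun AUTO_INCREMENT = 4;
```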

The Great Wall of Maintenance

Originally Posted 9/23/2014

The Great Wall. A symbol of strength that has lasted for centuries. But how did it last? Was it the engineering that allowed it to weather the test of time? Was it the native rock that allowed it to survive? Was it the loving care of individuals that allowed it to stay intact?

I never really thought about these questions before I visited the Mutianyu section of the Great Wall on a recent trip to Beijing. It was my first time to Beijing and I knew that a trip to the wall was a must. I had not had time to research before my trip (which is not like me), so I was not sure what to expect. I thought I would be impressed by the engineering, or the scenery. Instead, as I walked the wall and saw all the new bricks replacing the original granite, I started to think about the power of maintenance.

What does this have to do with the modern DBA? It is another reminder that all things need maintenance. As the pace of IT increases, and Database Administrators focus more and more on delivering systems, we need to remember that long term success only comes with the care and feeding of what has been entrusted to us. As we deliver database systems we need to ensure that maintenance plans are part of the deliverable. All systems need to have regular index and statistical maintenance to ensure that the database engine has the information it needs to generate efficient plans and minimize the resources needed to execute them.
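In MySQL terms, that baseline deliverable can start as small as a scheduled statistics refresh and periodic table rebuild. A rough sketch (the table names here are placeholders, and OPTIMIZE TABLE rebuilds the table on InnoDB, so run it inside an agreed maintenance window):

```sql
-- Refresh index statistics so the optimizer has current cardinality
-- estimates to build efficient plans from.
ANALYZE TABLE orders, customers;

-- Rebuild the table and its indexes to reclaim space left behind by
-- deleted rows; on InnoDB this maps to ALTER TABLE ... FORCE.
OPTIMIZE TABLE orders;
```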

And we can’t deliver these maintenance plans in a set-it-and-forget-it mindset. The Mutianyu section of the Great Wall follows the ridge line of a mountain chain. It was originally built using granite quarried nearby. But along the way they switched from using the granite to bricks made from what appeared to be concrete. Why did they make the switch? I don’t know for sure. But I know that if they continued to repair the wall using the mountain itself, they would eventually erode the foundation on which the wall was built. Similarly, if you put plans in place and do not monitor them, the ever-shifting environment of IT can make the foundation of your maintenance unsound.

To ensure that you avoid this problem, you need to monitor how long your maintenance plans are taking. Are they still completing in your allotted windows? Also, you need to check in with your customers periodically to ensure that the windows for maintenance haven’t shifted or, as is more likely the case, shrunk. Also, how effective is your maintenance? Do you track run times for important queries throughout the day? If you do, this information can help you decide whether maintenance can be run less frequently should your windows shrink.
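If you are on MySQL 5.6 or later, one low-cost way to track those query run times is the statement digest summary in performance_schema (enabled by default). A sketch of what that might look like; the timer columns are reported in picoseconds:

```sql
-- Top 10 statement shapes by total time spent, with average latency.
-- Divide the picosecond timer columns by 1e12 to get seconds.
SELECT DIGEST_TEXT,
       COUNT_STAR                     AS executions,
       SUM_TIMER_WAIT / 1000000000000 AS total_secs,
       AVG_TIMER_WAIT / 1000000000000 AS avg_secs
FROM   performance_schema.events_statements_summary_by_digest
ORDER  BY SUM_TIMER_WAIT DESC
LIMIT  10;
```

Snapshot this periodically and you have a baseline to compare against when you tune your maintenance schedule.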

Spend the time to build an effective maintenance strategy and a plan to monitor your maintenance and who knows, your system may still be running in 600 years. You’ll be out of support, but it might just still be running.

The Changing Role of the Database Administrator

Originally Published 10/31/2014

I have been doing this Database Administration gig, in one form or fashion, for more years than I care to admit.  In that time I have seen a lot of change.  Change in the tools and platforms I have supported, and change in the role that the Database Administrator plays in the development and operation of software systems.  But, it has felt like the pace of change has really sped up over the last 2-3 years.  I thought it would be good to go over some of the changes I think will have the largest impact on the Database Administrator and what I think it will mean to our role.

The Cloud

Probably the biggest change is the adoption of the cloud.  Whether public, private, or hybrid, cloud infrastructure is fundamentally changing the way that applications are designed and built.  A lot of people looked at the cloud early on through the eyes of old development models and felt like it would just be a way for small companies to compete with larger ones without having to have the financial resources to build and maintain their own data centers.  While that is one advantage of the cloud (one equally advantageous to large companies), it is not the largest advantage the cloud has to offer.  The smart companies are realizing that if you rethink the way you architect and build applications, you can achieve levels of elasticity which allow you to meet the ever-changing market without having to maintain infrastructure for peak demand all the time.

There are a lot of things that go into making applications capable of achieving this level of elasticity.  If you are interested in learning more, a great starting point is The Twelve-Factor App.  But a key piece is to integrate via API and not in the database.  A lot of legacy applications rely on the database as an integration point.  But modern applications see the database as just a place to persist data.  In many ways this reduces the complexity of databases and makes it easier for contributors other than the Database Administrator to serve as the expert on data storage.

Most cloud vendors have also added Database as a Service offerings to make it easier for companies without Database Administrators on staff to leverage database platforms.  These services offer installation, patching, backups, monitoring and other services that would traditionally be performed by Database Administrators.  These services combined with the change in how applications are being built make it easier for a project to get off the ground without the help of a Database Administrator.

NoSQL

NoSQL, or Not Only SQL, is the name for a class of data stores that do not store data in the tabular relationships associated with traditional Relational Database Management Systems (RDBMS).  Some popular examples include Redis, MongoDB, Couchbase, and Cassandra.  Each serves a specific data storage niche and allows you to meet your data storage needs without the complexities of a full-blown RDBMS.  While you give up some of the features of an RDBMS, you gain a lot in simplicity.  NoSQL platforms are typically much easier to maintain and have fewer “moving parts” than a traditional RDBMS.  This, along with their ability to easily scale horizontally, makes them very popular when building applications for the cloud.

I have not seen a wholesale movement of data from Relational Database Management Systems to NoSQL platforms, but I have seen a lot of new applications leverage NoSQL as the only, or at least primary, data store.  As a side note, I have also seen a lot of noise lately about applications moving off of NoSQL as the needs of the application change.  Like any other tool, NoSQL has its role, and the industry is still figuring out where it is best leveraged.

Big Data

Unless you have been hiding under a rock, you have probably heard the term Big Data.  Exactly what Big Data is remains open for some debate.  But you can think of Big Data as problems involving data that is either too large or too complex to analyze using traditional techniques.  The data is “big” because of its volume, velocity, variability, or complexity.  Volume describes the quantity of observations being analyzed.  Velocity is the speed with which new observations are being generated and need to be processed.  Variability describes the level of inconsistency in the observations being analyzed.  And complexity describes the overall difficulty in gathering the observations and combining them to make useful information.

The tool that most companies are moving to for Big Data problems is Hadoop.  One of the big advantages of Hadoop over traditional database platforms is that you need not define the structure of the data prior to it being stored.  Instead, you can define its structure at the time of analysis.  This makes the collection and storage of data much quicker from a process perspective than with a traditional data warehouse.  It is also easily scaled out by adding additional nodes to a cluster.  These capabilities address the volume, variability, and complexity associated with Big Data.  Schema design has historically been a service most Database Administrators have been involved with.  With Hadoop, these services are less in demand.

Much like NoSQL, Hadoop is a platform that does not require a Database Administrator in order for it to run.  Large environments do benefit from administration however, so if a Database Administrator is not responsible someone will probably need to step in.

The Future DBA

So, what does all of this mean for the Database Administrator of the future?  Well, I don’t have a crystal ball, but I think it is clear that the role will have to change.  First, many tasks that have traditionally been staples of the DBA role will either cease to be needed, or move primarily to developers or general system administrators.  Backups, monitoring, and patching will be either fully automated, or so easy that anyone will be able to “click the button” to perform the actions.  Database code deployment will no longer be the purview of the Database Administrator as the deployment process will be fully automated in order to allow for the elasticity that the cloud offers.  So, what will be left for the Database Administrator?

First off, I see all the major database platforms coming out with much better scale-out stories over the next several years.  Developers are moving to NoSQL for the simplicity of development, but one big reason they stay is the ability to scale out.  I see the market moving back toward traditional database platforms once the scale-out story is a better one.  Current NoSQL platforms are good for certain things, but none of them can compete with traditional database platforms in flexibility for handling diverse workloads.  Here is a great article comparing some common NoSQL platforms and how they serve different purposes at the Weather Co.

Secondly, no matter the future of NoSQL, Database Administrators will still be needed to address the hard-to-automate aspects of managing data platforms for the foreseeable future.  The biggest one of these is performance.  Database Management Systems are complex pieces of software, and it is still easy for performance to change without any changes to application or database code.  As systems grow in size or use, their performance characteristics shift.  Database Administrators will still be needed to help developers and system administrators understand and solve these issues.

I hope you have found this post interesting and I would love to hear your feedback.  What services do you see the Database Administrator of the future offering?  What other factors do you see shaping the role of the DBA?

Moving to WordPress

So I started the ModernDBA blog on Weebly, and it served me well in the beginning.  But I wanted a little more control than they provided, so I am trying out WordPress to see how I like it.  I’ll be working on moving my existing blog posts over to the new site over the next week or two.  I apologize in advance for any inconvenience as I work through this migration of content and domain name.