Keynotes Day 1 – Percona Live 2015

The morning keynote session started off with some information about the conference.  Some of the interesting information included:

  • Over 1200 attendees, 22 from Facebook and 20 from Oracle
  • Percona Live Amsterdam, Sept 21-22, 2015; the call for papers is open now
  • 205 Speakers
  • 146 Breakout Sessions

Then Peter Zaitsev, CEO of Percona, came out and gave a great session on the history of MySQL.  With this year being the 20th anniversary of MySQL, it was interesting to look back on how it has changed.  Then there was some information about the role that MySQL is playing today.  MySQL is leveraged by the top 5 websites today, and there is a lot of growth ahead with the move to the cloud and platforms like OpenStack.

After Peter was done, Harrison Fisk from Facebook came up on stage and gave a great overview of how they leverage multiple data platforms to handle different workloads.  Some notes from that presentation:

  • Different data storage technologies to handle varying data storage needs
  • MySQL is used for small data, including messaging and other systems
  • Small data is still 10’s of PB
  • Gets <5ms response times
  • Flash storage and flash cache (a hybrid flash and disk system built by Facebook)
  • 100’s of millions of QPS
  • HBase is used for small data as well (the messages themselves are in HBase for longer-term storage)
  • HBase is used on disk and provides a great platform for archived data
  • HBase: 10’s of PB
  • HBase: millions of QPS
  • HBase: <50ms response time
  • RocksDB is an embedded storage engine developed by Facebook (a SQLite replacement)
  • Used by a lot of internal systems, 9B QPS at peak, and is open source
  • Used by News Feed
  • Big Data
    • Hive
      • Minutes/Hours Queries
      • >300 PB logical size
      • >10PB Daily Processing
    • Presto
      • developed by Facebook
      • Open source distributed query engine with in-memory streaming
      • Presto doesn’t handle failures well
      • Seconds/minutes Queries
      • >10K Queries Per Day (big analytic queries)
      • >30 Trillion rows per day
    • Scuba
      • Not open sourced
      • Real time analysis tool used for debugging at Facebook
      • Leaf/aggregator architecture with an in-memory columnar store
      • 100s of TB
      • Many millions of rows ingested
    • New systems
      • RocksDB and MySQL – MySQL with RocksDB storage engine (Rocks is write optimized while InnoDB is read optimized)
      • Presto and MySQL – Sharded MySQL with Presto running on top
        • 100s of Millions of rows at once
        • Updatable, with all the benefits of MySQL

After Harrison was done they had a panel discussion on the next disruptive technology after the cloud and big data.  The first question asked was “What’s the next big thing?”  Think about a datacenter as a single element of compute and remove the human element from its management.  Application needs, rather than technology, will lead the way: choosing technology based on needs instead of trying to cram what we already have into fulfilling a need.  The Internet of Things is inevitable based on the human need to connect; technology is just an abstraction created by humans and it will follow.  Data security is going to be a big growth area, and IoT will drive this need when combined with the cheap cost of storage.  A singularity of information is coming where the data silos will be broken down.

The question was asked how we would know the next big thing is here.  Thomas Hazel from Deep Information Sciences had the answer that I identified with the most when he said “It will change our perspective, then it will change the business, then it will change our lives”.  Other ideas given were:

  • Proprietary technology will become niche
  • The next big thing will involve data
  • We are in the beginning of a data revolution
  • We can no longer be associated with a platform. Instead we have to be associated with services we provide.
  • The next big opportunity is turning all this data into useful information.
  • The core problem in data is analytics.
  • Data can be used to back up just about anything if we ask the wrong questions.  The big issue is how to ask the correct questions and know that we are doing so.
  • Data is becoming more personal where it used to be corporate.

Last they had a question and answer session with Steve Wozniak, one of the founders of Apple.  He shared some great ideas about the direction of technology as well as the meaning of life.  He shared with us some of his concerns surrounding education and what it will take for computers to play a lead role in the education of our youth.  Computers today serve as a powerful tool, but we continue to educate our children very similarly to the way we did in the past.  Steve suggested this is because of a number of factors.  Steve said that the way for computers to take a lead role is to become more cognitive and human-like.  People gravitate to humans to learn, and until computers are more human-like they will remain tools.

Steve also shared with us his two formulas for happiness.  His first formula was H=S-F, or Happiness = Smiles – Frowns.  Later on he decided that H=F^3: happiness is equal to food times fun times friends.  Then Steve wrapped up by sharing information on his relationship with Steve Jobs.

Then just when you thought we were done, there was a surprise announcement from Peter that Percona has acquired Tokutek, the company behind the TokuDB storage engine for MySQL and TokuMX for MongoDB.

All in all a great morning of information and a great kickoff to the main conference.

Getting Ready for Day 1 – Percona Live 2015

Waiting for Day 1 to kick off at Percona Live and I can’t help but reflect on the great day of training and networking we had yesterday.  This conference is a bit smaller than the ones I am used to, but the information has been top notch so far and the smaller size makes meeting and sharing ideas much easier.  Everyone I have met so far has been super nice and interested in sharing ideas and challenges.

Today we start the session portion of the event and so we’ll be covering more topics in less detail.  I have sessions lined up for the rest of the week across the MySQL space.  I will try to share as much as I can while still trying to keep up with all that is going on.

Percona XtraDB Cluster in Practice – Percona Live 2015

I knew right off this was going to be an awesome session.  When you sit down and the slide on the screen has instructions on how to set up an actual environment in VirtualBox, you know it is going to be good.  The session was presented by Jay Janssen of Percona and was broken up into two parts.  First we migrated a master-slave environment to Galera, and then we went over Galera issues and some admin information.

Most of the morning was spent setting up the environment by migrating from traditional replication to Galera.  I am sure there were some in the room who thought this was too basic, but it was great for me.  My team has been working on Galera for several weeks, but this was my first time putting my hands on it.  First we started with a 3 node asynchronous replication environment (1 master and 2 slaves).  We then replaced the XtraDB server install with Percona XtraDB Cluster on one of the slaves, changed the configuration on that node to make it a single-node cluster, and started it with bootstrap.  At this point we had a one-node Galera cluster getting transactions from the master.  Next we did the same on the second slave node and added it to the Galera cluster.  We then had to add a grant in MySQL on the first Galera node to allow backups to run so that the new Galera node could receive an SST (State Snapshot Transfer).  Then we did the same for the last node.  The tutorial is open source and available at https://github.com/percona/xtradb-cluster-tutorial.
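
To make that flow a little more concrete, here is a rough sketch of the kind of configuration and commands involved.  This is my own outline rather than the tutorial’s exact files; the node names, provider path, and SST user/password are placeholders, and the service commands vary by distribution and version.

# Minimal wsrep settings appended to the [mysqld] section of /etc/my.cnf
# on the first converted slave (values below are placeholders)
cat >> /etc/my.cnf <<'EOF'
[mysqld]
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_cluster_name=mycluster
wsrep_cluster_address=gcomm://node1,node2,node3
wsrep_node_name=node1
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:sstpassword
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
EOF

# Bootstrap the first node so it forms a one-node cluster
service mysql bootstrap-pxc

# On the bootstrapped node, add the grant that lets xtrabackup serve SSTs
mysql -e "GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost' IDENTIFIED BY 'sstpassword';"

# Each additional node gets the same settings (with its own wsrep_node_name)
# and a normal start; it joins the cluster and receives an SST from a donor
service mysql start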

After the hands-on part we dove into Galera-specific issues.  Highlights from that part of the talk were:

  • You need to look up the grant required by /usr/bin/clustercheck and run it so the script will work
  • wsrep_cluster_address is the list of addresses the node looks at to find and join the cluster named in wsrep_cluster_name
  • Setting up gmcast.segment doesn’t magically convert the traffic to async between the two segments.  You will need to use traditional replication between one node in the cluster and a standalone MySQL server (or one node of another cluster) instead.
  • The synchronous part means that all nodes have the transaction and acknowledge that it has been received.  The transaction is then certified on each node and the commit is finalized node by node; commits are certified asynchronously on the other nodes.  Race conditions are handled by issuing a deadlock error during certification.
  • Certification is deterministic.  If it passes on one node it will pass on all nodes.  Certification happens in order as commits happen on each node.
  • The application needs to test for deadlock errors on commit (and retry)
  • First commit wins
  • Locks are not replicated to other nodes because replication only happens at commit
  • There is slave lag; this is not two-phase commit.  But there is no data loss on node failure
  • Slow nodes cause an issue because of flow control.
  • Replication stops during DDL to ensure that certification remains deterministic
  • Rolling Schema Upgrade (RSU) mode allows you to upgrade one node at a time.  It takes the node out of flow control, lets it get behind, and does not replicate the change; the node then catches back up.  It is the same idea as master-slave replication where you make the change on the slave first.  You cannot drop a column, rename a column, or insert a column in the middle, and all the issues with ALTER TABLE on replicated slaves exist with rolling schema upgrades.  pt-online-schema-change avoids these issues but can cause a problem with foreign keys because of the renaming done by pt-osc, which can cause foreign keys not to fire properly for a short period of time.
  • All Galera settings are node specific
  • galera.cache (the gcache) does not persist past restarts, so having multiple nodes down can cause an SST if you don’t have a node that stays up while they are down.
  • On some Linux versions you can tell a node which donor node to use by running: service mysql start --wsrep_sst_donor=node1.  Or you can set the option in the config file, start the node, and then modify the config file afterwards.
  • The suspect timeout allows nodes to be evicted, and a new quorum vote is taken based on the previous node count.  If the remaining nodes are greater than 50%, the cluster is still Primary (P in myq_status); otherwise it is non-Primary (n in myq_status).
  • You can connect to a non-Primary node, but you cannot perform SELECTs, DML, or DDL over that connection; you can look at variables.  You can bootstrap the node, but then you have two clusters.
  • Writes are frozen during the timeout (10 seconds) when a node fails
  • The arbitrator node gets all the replication traffic.  This allows it to forward the traffic if the link between the nodes in the other data centers is down.  You need to take this into account when selecting the data center for the arbitrator node, as the latency and security of that traffic matter.
  • If all nodes go down you will have to bootstrap the node with the highest GTID (check with cat /var/lib/mysql/grastate.dat).  The only way to cleanly shut the whole cluster down is to stop the application from writing before you shut down.  (A rough command sketch follows this list.)
  • If the shutdown was not clean you will not have a GTID in grastate.dat.  You will need to use “mysqld_safe --wsrep-recover”; the GTID is the part after the colon on the recovered position line.
  • You can avoid SST by using a backup to initialize a node if your galera.cache is sized large enough to contain the transactions since the backup was taken.  You will need to have used the --galera-info option with your backup, and then use the file created by the backup to create a grastate.dat before you start MySQL.
  • If cat /var/lib/mysql/grastate.dat shows an all-zero UUID, the node will SST just as if the file were missing.
  • If a node won’t SST and the data dir is corrupt, you need to remove the data dir, run mysql_install_db --user=mysql to create a clean empty data dir, and then start MySQL to SST.  Make sure you remove the .sst directory as well.
  • FLUSH TABLES WITH READ LOCK will stall your cluster because of flow control.
  • You can tune flow control using gcs.fc_limit to adjust how deep the queue can get before flow control kicks in (see the sketch just after this list).
  • The problem node in a flow control situation is the one with the non-zero receive queue depth.  Kill it or fix it and the cluster will resume.
  • wsrep_sync_wait causes reads to wait for pending applies before the read happens.  It slows down reads but ensures reads on all nodes are consistent.
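
As a quick reference for the flow control and read-consistency settings above, here is a rough sketch of the commands involved.  This is just my own outline; the values and the table name are placeholders, not recommendations.

# Spot the node triggering flow control: look for a non-zero receive queue
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue';"
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused';"

# Allow a deeper queue before flow control kicks in (example value only)
mysql -e "SET GLOBAL wsrep_provider_options='gcs.fc_limit=500';"

# Make a session's reads wait for pending applies first (causal reads)
mysql -e "SET SESSION wsrep_sync_wait=1; SELECT COUNT(*) FROM test.t1;"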

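Along the same lines, here is a rough sketch of the full-cluster recovery flow described above (all nodes down).  Again, this is from my notes; the donor name is a placeholder and the service commands vary by distribution.

# On each node, find the last committed GTID (clean shutdown case)
cat /var/lib/mysql/grastate.dat

# If the shutdown was not clean, recover the position instead;
# the GTID is the part after the colon on the recovered position line
mysqld_safe --wsrep-recover

# Bootstrap the node with the highest GTID so it forms the new Primary cluster
service mysql bootstrap-pxc

# Start the remaining nodes normally, optionally naming a donor explicitly
# (on some distributions extra options can be passed on the start line)
service mysql start --wsrep_sst_donor=node1
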
I was able to keep up all morning and for most of the afternoon, but Jay lost me with the cluster check part and so I had to stop doing the hands-on parts.  But we plan on using an F5, and I understand the use case, so I will play with it more when I get back home.  I was able to fix the issue on the break.  I had misheard what option to put in the mysqlchk xinetd config file for the type parameter.  It needed to be UNLISTED, not UNCHECKED as I had heard.  This allowed me to get it running on all the nodes.
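
For anyone else wiring up the health check, here is roughly what the pieces look like.  This is a sketch based on my notes; the grant uses the clustercheck script’s default user and password, and the port and allowed addresses are placeholders you will want to change.

# Grant used by /usr/bin/clustercheck (defaults shown; change the password)
mysql -e "GRANT PROCESS ON *.* TO 'clustercheckuser'@'localhost' IDENTIFIED BY 'clustercheckpassword!';"

# /etc/xinetd.d/mysqlchk -- type = UNLISTED because mysqlchk is not in /etc/services
cat > /etc/xinetd.d/mysqlchk <<'EOF'
service mysqlchk
{
    disable         = no
    type            = UNLISTED
    port            = 9200
    socket_type     = stream
    protocol        = tcp
    wait            = no
    user            = nobody
    server          = /usr/bin/clustercheck
    flags           = REUSE
    log_on_failure  += USERID
    only_from       = 0.0.0.0/0
    per_source      = UNLIMITED
}
EOF

service xinetd restart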

So my brain is fried from all this.  I will definitely be playing with this more when I get back from the conference, so you might see more posts around this content.  Stay tuned for more from Percona Live and on Galera.

Whole Hog

Originally Posted 10/13/2014


Like any good North Carolina boy, I appreciate good whole hog barbecue.  Some from the western part of the state would say that the only parts of the pig worth barbecuing are the shoulders, but I’ve always been a whole hog kind of guy.  With some slaw and a few hush puppies, it doesn’t get any better.  While I love my barbecue whole hog, that is typically not the way I operate professionally.  As a Database Administrator I tend to be more conservative when it comes to the adoption of new technologies.  That is not to say that I don’t like playing with new things.  It just means that I like to roll out new technologies in a way that ensures early success for myself and the business.

Currently, I find myself in a situation where I have to put all my chips on the table.  We have an application, built on top of MySQL, that has become critical to the business, and based on some strategic decisions and time pressures it was decided that porting it to another platform was not the way to go.  So, while I am excited to be adding MySQL support with my team, I am also a bit nervous to be taking such a large bite.  Starting off support for a new platform with such a critical and highly visible system is not the way I normally like to go.

I have a great team of DBAs that I work with.  Some of us already have varying degrees of experience with MySQL.  So I am confident that we will make this a success.  We have already started a MySQL training study group to get up to speed on MySQL administration.  We have also been installing and playing with MySQL in different High Availability configurations.  So while I am a bit nervous, I am not really worried.  I also feel that the Modern DBA needs to learn to let go and trust their training and experience.  The speed of IT is only increasing.  The technologies are ever changing.  We need to realize that sometimes the business needs us to adapt and take risks.  But we also need to realize that they want us to manage those risks.

I hope to keep you all up to date on how this project goes.  Until then, grab a fork and some nana pudding.  We are going whole hog.

The Great Wall of Maintenance

Originally Posted 9/23/2014

The Great Wall. A symbol of strength that has lasted for centuries. But how did it last? Was it the engineering that allowed it to weather the test of time? Was it the native rock that allowed it to survive? Was it the loving care of individuals that allowed it to stay intact?

I never really thought about these questions before I visited the Mutianyu section of the Great Wall on a recent trip to Beijing. It was my first time to Beijing and I knew that a trip to the wall was a must. I had not had time to research before my trip (which is not like me), so I was not sure what to expect. I thought I would be impressed by the engineering, or the scenery. Instead, as I walked the wall and saw all the new bricks replacing the original granite, I started to think about the power of maintenance.

What does this have to do with the modern DBA? It is another reminder that all things need maintenance. As the pace of IT increases, and Database Administrators focus more and more on delivering systems, we need to remember that long term success only comes with the care and feeding of what has been entrusted to us. As we deliver database systems we need to ensure that maintenance plans are part of the deliverable. All systems need to have regular index and statistical maintenance to ensure that the database engine has the information it needs to generate efficient plans and minimize the resources needed to execute them.

And we can’t deliver these maintenance plans in a “set it and forget it” mindset. The Mutianyu section of the Great Wall follows the ridge line of a mountain chain. It was originally built using the granite quarried nearby. But, along the way, they switched from using the granite to bricks made from what appeared to be concrete. Why did they make the switch? I don’t know for sure. But I know that if they continued to repair the wall using the mountain itself, they would eventually erode the foundation on which the wall was built. Similarly, if you put plans in place and do not monitor them, the ever shifting environment of IT can make the foundation of your maintenance unsound.

To ensure that you avoid this problem, you need to monitor how long your maintenance plans are taking. Are they still completing in your allotted windows? Also, you need to check in with your customers periodically to ensure that the windows for maintenance haven’t shifted or, as is more likely the case, shrunk. Also, how effective is your maintenance? Do you track run times for important queries throughout the day? If you do, this information can help you decide whether maintenance can be run less frequently should your windows shrink.

Spend the time to build an effective maintenance strategy and a plan to monitor your maintenance and who knows, your system may still be running in 600 years. You’ll be out of support, but it might just still be running.

Managing the 3rd Party sysadmin curse in Microsoft SQL Server

Originally Posted 8/28/2014

An issue that has always come up with 3rd party applications that run on top of SQL Server is their insatiable desire to have sysadmin role access to the instance they run on.  It is an issue I have dealt with for as long as I can remember.  Historically, the documentation would ask for the access, but when it came down to it we always found a way that they could get by with lesser permissions (i.e., db_owner and maybe throw in some permissions in msdb to manage jobs).  But lately, we have been seeing more and more applications that are building sysadmin role checks into their installers and not moving forward without it.  Microsoft even requires it for System Center Operations Manager and other products.

My team is a cautious bunch, and we try our best to use shared instances as much as we can to keep down costs (we run a density of about 25 databases per instance), which means we are spending more time these days tracing installs and negotiating with vendors over permissions on these 3rd party applications.

For those of you who came to this page looking for the silver bullet, I am sorry, but that is not the purpose of this post.  Instead, I am interested to hear what you all have tried in order to manage this issue.  I put together a quick, lighthearted survey to try and see how most people are handling it.  I am also interested to hear if anyone has found a great way to deal with this issue.  The survey is hosted by SurveyMonkey.

The Froogle DBA – Building CMS based on Extended Properties

Originally Posted 8/24/2014

Before I dive into the next component in the Froogle DBA’s toolbox, I wanted to give you all an idea of what’s to come.  In the original Froogle DBA plan, on which my SQL Saturday presentation was built, there are three components involved in automatically documenting the managed environment in Central Management Servers (CMS).  I already went over Extended Properties and showed you how to add them to a database.  I spoke a little about the advantage of using Extended Properties to document your environment.  I even talked about using CMS to query the Extended Properties to answer some common questions.

While I was procrastinating from working on my presentation, it dawned on me that not everyone will want to run the other piece of the Froogle DBA trinity, Management Data Warehouse (MDW).  A lot of people will already have monitoring tools in place, and MDW is probably the hardest piece to scale and manage.  So I thought, how could people take advantage of tighter integration between CMS and Extended Properties without having to run MDW?  So I spent some time today and threw together a PowerShell script that can be used to populate groups in CMS based on Extended Properties you have defined on your databases.

The idea is that you will have a group inside of CMS that lists all the instances of SQL Server you manage.  This script will query all the Extended Properties defined at the database level on these instances.  It will then look for those that start with a specific prefix (the default is “CMS “).  It will then build a group for each of these properties under the root of the CMS, removing the prefix.  So, for example, the Extended Property “CMS Application” would create a group in CMS called “Application”.  It will then build a sub-group for each unique value defined for that Extended Property.  Then, lastly, it will register all the instances of SQL Server that had that value under the sub-group.  For example, if you have a database on Instance1 and another on Instance2 with the Extended Property “CMS Application” set to the value “Application 1”, you would see the results in the following picture.

[Picture: the resulting CMS group tree, with an “Application” group, an “Application 1” sub-group, and Instance1 and Instance2 registered beneath it]

Now I need to fully disclose that this is a script that I have not run in production.  So, please verify this in a test environment before turning it loose on production.  And I would be very interested in any feedback you might have on the script.  Did you find it useful?  Did you need to modify it to fit your environment?  Do you have ideas for enhancements?

Well, I guess it’s time to reveal the script.

#=========================================================================
# Filename:     PopulateCMSFolders.ps1
# Created on:   20140824
# Created by:   John Greeson
#
# Description:  This script will query all the databases belonging to
#               instances in the "Instance List" group and look for 
#               Extended Properties starting with "CMS ".  It will then 
#               build supporting folders in CMS based on those variables if
#               they are missing.  It will empty all the folders it found
#               and then repopulate them with the list of subfolders and 
#               instances if found in the query of Extended Properties.
#
# Requirements: In order for this script to run, the CMS Server needs to be
#               registered in SSMS by the user running the script on the 
#               machine it will run on.  This means if you want to automate 
#               the task you will need to determine what Windows account
#               the automated task will run as, log onto the machine it 
#               will run on as that user, and register the CMS Server in 
#               SQL Server Management Studio.
#
#  The MIT License (MIT)
#
#  Copyright (c) 2014 John Greeson
#
#  Permission is hereby granted, free of charge, to any person obtaining a copy
#  of this software and associated documentation files (the "Software"), to deal
#  in the Software without restriction, including without limitation the rights
#  to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
#  copies of the Software, and to permit persons to whom the Software is
#  furnished to do so, subject to the following conditions:
#
#  The above copyright notice and this permission notice shall be included in
#  all copies or substantial portions of the Software.
#
#  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
#  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
#  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
#  AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
#  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
#  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
#  THE SOFTWARE. 
#
#=========================================================================


#=========================================================================
#                          VARIABLES TO CHANGE
#=========================================================================

# Central Management Server
$CMSInstance = "SQLMON";

# Master List of Instances
$MasterInstanceList = "Instance List";

# Extended Property Prefix
$Prefix = "CMS ";

#=========================================================================
#                                THE MAGIC
#=========================================================================

Import-Module SQLPS -DisableNameChecking;

# Create data table to store Extended Properties
$table = New-Object system.Data.DataTable "ExtendedProperties";

# Add Columns to data table
$col1 = New-Object system.Data.DataColumn Instance,([string]);
$col2 = New-Object system.Data.DataColumn Connection,([string]);
$col3 = New-Object system.Data.DataColumn Database,([string]);
$col4 = New-Object system.Data.DataColumn Name,([string]);
$col5 = New-Object system.Data.DataColumn Value,([string]);

$table.Columns.Add($col1);
$table.Columns.Add($col2);
$table.Columns.Add($col3);
$table.Columns.Add($col4);
$table.Columns.Add($col5);

# Set path to the CMS list containing the master list of Instances
$pathToListOfInstances = "SQLSERVER:\SQLRegistration\Central Management Server Group\" + $CMSInstance + "\" + $MasterInstanceList;

# Get list of instances
$instanceList = get-childitem $pathToListOfInstances -recurse;

# Gather All Extended Properties
foreach($instance in $instanceList)
{
    $instanceName = $instance.Name;
    $instanceConnection = $instance.ServerName;

    $server = new-object ("Microsoft.SqlServer.Management.Smo.Server") $instanceConnection;

    foreach($database in $server.Databases)
    {
        foreach($property in $database.ExtendedProperties)
        {
            $row = $table.NewRow();
            $row.Instance = $instanceName;
            $row.Connection = $instanceConnection;
            $row.Database = $database.Name;
            $row.Name = $property.Name;
            $row.Value = $property.Value;
            $table.Rows.Add($row);
        }
    }
}

# Set path to the CMS root
$pathToCMSRoot = "SQLSERVER:\SQLRegistration\Central Management Server Group\" + $CMSInstance;

# Conditionally populate groups
$rootGroups = $table | where Name -like $($Prefix + "*") | Group-Object -Property Name;
foreach($rootGroup in $rootGroups)
{
    $groupName = $rootGroup.Name;

    # Strip off the tag from the Extended Property
    $firstLevelGroup = $groupName.Substring($Prefix.Length);

    # Define path to the first level group
    $groupPath = $pathToCMSRoot + "\" + $firstLevelGroup;
 
    if(!(Test-Path $groupPath))
    {
        mkdir $groupPath;
    }

    # Empty out groups we manage
    Remove-Item $($groupPath + "\*") -Recurse;

    # Create Subgroups
    $subGroups = $table | where Name -eq $groupName | Group-Object -Property Value;
    foreach($subGroup in $subGroups)
    {
        # Define path to the sub-group
        $groupPath = $pathToCMSRoot + "\" + $firstLevelGroup + "\" + $subGroup.Name;

        mkdir $groupPath;

        # Add instances to subgroup
        $instances = $table | where Name -eq $groupName | where Value -eq $subGroup.Name | sort-object -Property Instance -Unique;

        foreach($inst in $instances)
        {
           $con = $inst.Connection;
           New-Item -Name $inst.Instance -path $groupPath -ItemType Registration -Value ("Server=$con; integrated security=true");
        }
    }
}

RDBMS Platforms and Airline Seat Categories

So I just finished a weeklong trip to Northern Italy with the family. This was our first time bringing my children to Italy (my wife and I have been twice before, and I lived in Italy for 3 years while in high school). I am also late in the process at work of bringing on support for a Highly Available MySQL solution. This effort is being performed by a group I lead, made up of Microsoft SQL Server and Oracle DBAs, that was recently formed out of two separate teams. As I was on the plane ride home (and trying hard to stay awake) I began to think about the differences between MySQL and SQL Server and Oracle.

Admittedly, we are not through the implementation phase of our MySQL rollout, but we have spent the past year researching and have engaged partners to help us architect and build out a highly available infrastructure for MySQL. As I sat in Economy Class with my wife and 3 children (the last two international trips my wife and I have taken were business class), I started to think that there are a lot of parallels between the 3 platforms and the 3 classifications of travel on most international flights (Economy, Business, and First classes).

First, let me start by upsetting some of my SQL Server DBA brethren and say that Oracle is still the first class option in the RDBMS space. The price also reflects this reality. While the starting points for Oracle and the Enterprise Edition of SQL Server are not that far apart, Oracle will get to a whole other level quickly as you add HA and performance features that are already baked into the Enterprise Edition of SQL Server. But for that extra money you get options that just aren’t available in SQL Server (e.g., RAC and Exadata).

SQL Server fills the business class space in this analogy. Some would like to argue with me on this and talk about the value. But really, with all the recent changes to the Microsoft licensing model for SQL Server pushing you into buying Software Assurance, putting all the HA options in Enterprise, and deprecating Mirroring in Standard Edition, I think I can win the argument that while there is still a lot of value, it is much more of a business class value point than an economy class one.

Just as in business class, where you pay a bit more for your ticket but all your drinks are included and the product is a bit more polished, buying SQL Server Enterprise Edition also covers 99% of potential needs and is very polished. I still have concerns about Microsoft’s trajectory for SQL Server and what it will do to the product long term, but that is for a different post. So that leaves us with MySQL.

So far, I would say that MySQL is very much the economy class in this situation. First, I don’t see MySQL as being free (as in free beer) as some would argue. The support costs still exist. To be successful you will need to either invest in training and/or support, and this has a cost. Second, MySQL has matured to the point where it will get you to the same place as Oracle and SQL Server. In fact, I can make a good argument that MySQL (with Galera) provides a level of protection and performance that is not available in the other two if architected and used correctly. But, just as you have to work to sneak the extra drinks and snacks in economy class, there is far more work in architecting, building, and supporting these highly available environments in MySQL.

MySQL, however, provides you much more control over your costs. You can choose how much to invest in external vs. internal support. Companies like Facebook and Google pay to have expertise on staff able to not only support MySQL but also modify the code base. Most companies do not need this level of in-house expertise, especially with so many companies now providing support for MySQL. For most, all that is needed is a trained DBA and access to support from one of these vendors.

Well, that was my great revelation on the flight back from Europe. Now time to go try to sneak some champagne from Business class.

The Changing Role of the Database Administrator

Originally Published 10/31/2014

I have been doing this Database Administration gig, in one form or fashion, for more years than I care to admit.  In that time I have seen a lot of change.  Change in the tools and platforms I have supported, and change in the role that the Database Administrator plays in the development and operation of software systems.  But, it has felt like the pace of change has really sped up over the last 2-3 years.  I thought it would be good to go over some of the changes I think will have the largest impact on the Database Administrator and what I think it will mean to our role.

The Cloud

Probably the biggest change is the adoption of the cloud.  Whether public, private, or hybrid, cloud infrastructure is fundamentally changing the way that applications are designed and built.  A lot of people looked at the cloud early on, through the eyes of old development models, and felt like it would just be a way for small companies to compete with larger ones without having to have the financial resources to build and maintain their own data centers.  While that is one advantage of the cloud (one equally advantageous to large companies), it is not the largest advantage the cloud has to offer.  The smart companies are realizing that if you rethink the way you architect and build applications, you can achieve levels of elasticity which allow you to meet the ever changing market without having to maintain infrastructure for peak demand all the time.

There are a lot of things that go into making applications capable of achieving this level of elasticity.  If you are interested in learning more, a great starting point is The Twelve-Factor App.  But a key piece is to integrate via API and not in the database.  A lot of legacy applications rely on the database as an integration point.  But modern applications see the database as just a place to persist data.  In many ways this reduces the complexity of databases and makes it easier for contributors other than the Database Administrator to serve as the expert on the data storage.

Most cloud vendors have also added Database as a Service offerings to make it easier for companies without Database Administrators on staff to leverage database platforms.  These services offer installation, patching, backups, monitoring and other services that would traditionally be performed by Database Administrators.  These services combined with the change in how applications are being built make it easier for a project to get off the ground without the help of a Database Administrator.

NoSQL

NoSQL, or Not Only SQL, is the name for a class of data stores that do not store data in the tabular relationships associated with traditional Relational Database Management Systems (RDBMS).  Some popular examples include Redis, MongoDB, Couchbase, and Cassandra.  Each serves a specific data storage niche and allows you to meet your data storage needs without the complexities of a full-blown RDBMS.  But while you give up some of the features of an RDBMS, you gain a lot in simplicity.  NoSQL platforms are typically much easier to maintain and have fewer “moving parts” than a traditional RDBMS.  This, along with their ability to easily scale horizontally, makes them very popular when building applications for the cloud.

I have not seen a wholesale movement of data from Relational Database Management Systems to NoSQL platforms, but I have seen a lot of new applications leverage NoSQL as the only, or at least primary, data store.  As a side note, I have also seen a lot of noise lately about applications moving off of NoSQL as the needs of the application change.  Like any other tool, NoSQL has its role, and the industry is still figuring out where it is best leveraged.

Big Data

Unless you have been hiding under a rock, you have probably heard the term Big Data.  Exactly what Big Data is remains open for some debate, but you can think of Big Data as problems involving data that is either too large or too complex to analyze using traditional techniques.  The data is “big” because of its volume, velocity, variability, or complexity.  Volume is the term used to describe the quantity of observations being analyzed.  Velocity is the speed with which new observations are being generated and need to be processed.  Variability is used to describe the level of inconsistency in the observations being analyzed.  And complexity is the term used to describe the overall difficulty of gathering the observations and combining them to make useful information.

The tool that most companies are moving to for Big Data problems is Hadoop.  One of the big advantages of Hadoop over traditional database platforms is that you need not define the structure of the data prior to it being stored.  Instead, you can define its structure at the time of analysis.  This makes the collection and storage of data much quicker, from a process perspective, than with a traditional data warehouse.  It is also easily scaled out by adding additional nodes to a cluster.  These capabilities address the volume, variability, and complexity associated with Big Data.  Schema design has historically been a service most Database Administrators have been involved with.  With Hadoop, these services are less in demand.

Much like NoSQL, Hadoop is a platform that does not require a Database Administrator in order for it to run.  Large environments do benefit from administration however, so if a Database Administrator is not responsible someone will probably need to step in.

The Future DBA

So, what does all of this mean for the Database Administrator of the future?  Well, I don’t have a crystal ball, but I think it is clear that the role will have to change.  First, many tasks that have traditionally been staples of the DBA role will either cease to be needed, or move primarily to developers or general system administrators.  Backups, monitoring, and patching will be either fully automated, or so easy that anyone will be able to “click the button” to perform the actions.  Database code deployment will no longer be the purview of the Database Administrator as the deployment process will be fully automated in order to allow for the elasticity that the cloud offers.  So, what will be left for the Database Administrator?

First off, I see all the major database platforms coming out with much better scale-out stories over the next several years.  Developers are moving to NoSQL for the simplicity of development, but one big reason they stay is the ability to scale out.  I see the market moving back more towards traditional database platforms once the scale-out story is a better one.  Current NoSQL platforms are good for certain things, but none of them can compete with traditional database platforms for flexibility in handling diverse workloads.  Here is a great article comparing some common NoSQL platforms and how they serve different purposes at The Weather Company.

Secondly, no matter the future of NoSQL, Database Administrators will still be needed to address the hard-to-automate aspects of managing data platforms for the foreseeable future.  The biggest one of these is performance.  Database Management Systems are complex pieces of software, and it is still easy for performance to change without any changes to application or database code.  As systems grow in size or use, the performance characteristics shift.  Database Administrators will still be needed to help developers and system administrators understand and solve these issues.

I hope you have found this post interesting and I would love to hear your feedback.  What services do you see the Database Administrator of the future offering?  What other factors do you see shaping the role of the DBA?