OpenWorld 2017: InnoDB: What’s new in 8.0

Sunny Bains from Oracle

  • Legacy – Multiple Data Dictionaries (.frm & InnoDB DD)
    • changes not atomic
    • mismatch possible
    • concurrent access had to be managed
    • not crash proof
  • 8.0 is transitional so InnoDB will have mutex for this reason but will go away in future releases
  • New Data Dictionary
    • Required for transactional DDL (future)
    • stated in InnoDB
    • Control meta-data access using single locking mechanism
    • no .frm files for temporary tables – meta-data in memory only
    • Improves table spaces by removing .frm files
    • Stored in new InnoDB system tables
    • single set of persisted metadata for all storage engines
    • makes atomic DDL possible
  • InnoDB serves 2 roles
    • Data Dictionary store for all storage engines
    • implements atomic DDL (DDL_Log)
  • InnoDB gets metadata from the server not from system tables
  • DDL_Log
    • post commit is when physical deletes happen
    • during process the intent to delete is recorded in the DDL_Log
    • stored inside InnoDB
    • no direct user access
  • Performance
    • Cost Based Optimizer statistics
      • Number of pages in RAM per index
    • Remove the buffer pool mutex (Percona contribution)
      • Took a long time to fix problems in the contributed patch
      • QA team found lots of problems in edge cases
      • Foundation for more improvements in the future
    • CATS (Contention Aware Transaction Scheduling)
      • contributed by University of Michigan DB researchers
      • No configuration required
      • Switches between FIFO and CATS automatically
        • threshold is >-= 32 waiting threads
    • group records by table id when purging
      • reduces contention of dict_index_t::lock when multiple purge threads enabled
      • –innodb_deadlock_detect := bool (dynamic)
        • on high concurrent workloads deadlock detector becomes expensive so this turns it off and rely on rollback
      • –innodb_stats_include_delte_marked : bool
        • include/exclude rows marked as deleted
    • memcache improvments
      • support multiple get and range search
    • persistent auto increment
      • doesn’t rest to SELECT MAX(AUTOINC_COL) FROM T; on restart
      • probably the most requested feature since v3.x
      • Bug 199 – created 27 March 2003
    • new INFORMATION SCHEMA tabel INNODB_CACHED_INDEXES
      • pages cached in the InnoDB buffer pool cache
    • Ability to manage Undo table space
      • Add Drop undo table space
      • default of 2 undo table space required
      • undo truncate on by default
    • Avoid intermediate commits that would occur every 10000 rows
    • removed .sl files(InnoDB Symbolic Link files)
    • –innodb-read-only semantics change
      • if ON then affects entire MySQL instance
      • Because DD tables are stored in InnoDB
    • Better tablespace management
      • versioning of tablespaces
        • improves upgrade process
        • support multiple tablespace/page/row foramts
        • helps with maintaining backward compatibility
        • easier to introduce new features/capabilities
          • ie page/row formats that give better compression
    • Serialized Dictionary Information (SDI)
      • Tablespaces self describing
      • json file
      • improved import/export
      • tools for extracting
      • online offline
      • useful during disaster recovery
    • New In-Memory storage engine (temptable)
      • for internal use only
      • not shared across connections
      • lifetime limited to query life time
      • limited size, bound by ram used
        • –temptable-max-ram
    • More flexible BLOB handling
      • allow partial fetch and update
      • plan to make streaming easier
    • encrypt redo and undo logs
    • descending indexes
    • NOWAIT & SKIP_LOCK
    • Dedicated server
      • –innodb-dedicated-server (default OFF)
      • sets default value basedon physical memory available
      • dynamically sets the following variables (UNIX only)
        • –innodb-log-file-size
        • –innodb-buffer-pool-size
        • –innodb-flush-method
    • Scalable redo log
      • Dedicated redo log threads
    • Log buffer can be resized online
    • Scalable IO layer
  • Movement aware of static variables – everything should be dynamic

OpenWorld 2017: MySQL 8.0: What’s New in Security

Mike Frank and Georgi Kodinov from Oracle gave a deep dive into security changes in MySQL 8.0.  A lot of the features are only available in MySQL Enterprise and I tired to keep it clear in my notes which required Enterprise but may have gotten some confused.

  • Security Challenges
    • 89% of organizations experience data breaches, according to new Ponemon Survey
    • Complexity grows risks grow
      • Database consolidation
      • HA
    • Regulatory compliance
      • PCI
      • HIPAA
      • GDPR
      • FERPA
      • Data Protection Act (UK)
      • ….
  • How to Secure
    • Assess
      • Locate Risk and Vulnerabilities
    • Prevent
      • Using Cryptography, User Controls, Access Controls, etc
    • Detect
      • Audit, Monitor, Alert
    • Recover
      • Ensure service is not disrupted as a result of security incident
      • Even through the outage of a primary database
      • Forensics – post mortem – fix vulnerability
  • MySQL Security Solutions
    • Enterprise Edition
      • TDE
      • MySQL Enterprise Authentication
      • MySQL Enterprise Encryption
      • MysQL Enterprise Firewall
      • MySQL Enterprise Audit
      • MysQL Enterprise Monitor
      • MysQL Enterprise Backup
      • MysQL Enterprise Thread pool
    • TDE
      • key ring
      • only tablespace level in 5.7
      • Enterprise Edition uses KMIP for key management
    • Keyring API
      • each key has a name/ACL
      • stored in key store
      • a uniform infrastructure for handling keys
      • usable by both server and plugins
      • Fully extensible
      • minimum effort to add new backends and consumers
      • current consumers
        • InnoDB tablespace encryption
        • SQL user defined functions (UDF) plugin
      • current backends
        • flat file backend
        • KMIP compliant clients
  • New Features
    • MySQL Roles
      • multiple default roles
      • Can export the role graph in GraphML
      • implemented as user account with login disabled
      • new tables mysql.role_edges and mysql.default_roles
    • Atomic ACL Statements
    • Dynamic Privileges
      • provides finer grained administrative level access controls
    • Password Features
      • Password History
      • SHA2 with Caching
    • File Encryption
      • innodb_undo_log_encrypt
      • innodb_redo_log_encrypt

OpenWorld 2017: Split Your Database: Seamless Database Growth with MySQL

Nicolai Plum from Booking.com gave a presentation on how to split a database into multiple databases while minimizing/managing the impact to applications using it.  There is no voodoo magic, but it was a great session covering how to work with the application developers to split a database.

  • Stated with sales DB
    • Customers
    • Transactions
    • Delivery
    • Catalog
    • Inventory
    • Reviews
  • Breaking out the Product information
    • Catalog
    • Inventory
    • Reviews
  • Good idea to change schema name as you go to avoid confusion
  • Why
    • size
    • query time
    • regulatory requirements
    • business/architecture reasons
  • “GDPR best practice is to split private information away from non private information”
  • How to choose tables
    • minimize joins
    • Maximise gains
      • split out bigger, fastest growing, business, most regulated
      • size and activity use: ps-top (utility to help figure this out)
  • Code changes needed
    • may be able to use ProxySQL (has not tried it)
    • try to separate read and write traffic
  • “Everyone has a test environment, some people have a production one as well”
  • Use binlog to help
  • Get rid of joins in updates
  • joins move from sql to app code
  • foreign keys across servers will not work – relational integrity checks move up to app layer
  • uses a translation master to move data between the two schema
    • uses replicate-rewrite-db
    • uses replicate-wild-do-table
  • Steps to create translation master
    • Make a slave, with binlog enabled
    • shut down MySQL
    • configuration: Add
      • skip-slave=start
      • replicate-rewrite-db=’sales=>product’
      • replicate-wild-do-table…..
    • Create new schema
    • Rename all tables that are being split to the new schema
    • Drop old schema from translation master
    • start replication slave
    • Remove skip-slave-start
    • If replication stops with missing table errors, find the bad query, fix up (skip) or make a new translation master
  • Add product schema on sales master to allow product to read and write to the same schema name
    • use sys;
    • call create_synonym_db(‘sales’, ‘product’);
  • Need views in slaves also
  • Now ready to remove translation master and cut over to the new product replication topology
  • Need to fix up grants – they don’t replicate through translation master
    • use different user names to avoid collisions/confusion
  • Now cut the name over – either push config or use DNS CNAME or A record  update
  • Tradeoffs with cut over
    • Disconnect first to have cleanest cut over but with a little downtime
    • Disconnect last to keep application up but with risk of collisions

OpenWord 2017: MySQL 8.0: What’s New in the Optimizer

Session was presented by Manyi Lu and Oystein Grovlen from Oracle and provided a little more depth to topics covered in previous presentations I attended.

  • CTE
    • can precede SELECT/UPDATE/DELETE including sub-queries
    • Can be used to refer to other CTEs (not able to do that with direct tables)
    • Recursive CTEs
      • set cte_max_recursion_depth to limit runaway recursion (default 100)
  • Window Functions
    • OVER (PARTITION BY ……)
  • UTF-8 Support (utf8mb4)
    • utf8mb4_0900_ai_ci default collation
    • performance gains
    • recommend utf8mb4_unicode_520_ci if prior to MySQL 8.0
    • careful with migrated tables can lead to
      • “illegal mix of collation” issues
      • slower queries as index will not be used
  • GIS
    • Geography Support
      • longitude/latitude
    • Spatial Reference Systems
  • JSON
    • Aggregation functions – combine JSON documents in multiple rows into a JSON array
    • Convert JSON documents to relational tables – allows access to sql aggregations and other methods
    • Utility Functions
      • JSON_PRETTY pretty print
      • JSON_STORAGE_SIZE – shows bytes used to store doc
      • JSON_STORAGE_FREE – shows bytes not used
  • Invisible Index
  • Descending indexes
  • Cost Model improvements
    • adapt to SSSD, larger memories, caches
    • cost model takes into account where data is stored
  • Histograms
  • Join Order Hints

OpenWorld 2017: Using MysQL Containers

Matt Lord (Senior MySQL Product Manager at Oracle) gave a good presentation on the state of MySQL with containers.  I had already been to a session on Oracle and containers and there was some overlap, but it was a good presentation on the state of Oracles support of MySQL on Docker.

My sense after attending both of these sessions is that Docker and contains have a large roll in the future delivery of database services.  I also have the sense that we are about 6 months to a year away from those on the bleeding edge having worked out the kinks and it will all be ready for production database workloads.

Notes:

  • What is a Container (already covered in Oracle Docker presentation)
    • Copy on Write layering file system
  • Native Windows Containers – Windows 10 and Windows Server 2016 (similar features to Linux kernel)
  • Makes most sense to match base OS with Container OS, but not required
    • Because less differences so less moved up in filesystem layers
  • Mac and Windows both support running Linux containers but run on top of visualizer (xhyve in mac, hyperv on Windows) – Alpine Linux (built for Docker)
  • Why
    • Package software in a single binary image
    • Everything is repeatable
    • Helps with patching
  • Official MySQL Container Goals
    • Official release product includes a container built on Oracle Linux
    • Container for all products
      • MySQL (NDB) Cluster – preliminary now
      • InnoDB Cluster – not public now
      • Router, Shell, Workbench, Utilities, etc. – future
    • Work with compose, Swarm, Kubernetes
    • Integrate with wider Oracle plans for Docker and Container Services
  • Demo of Docker on Mac
    • Showed vm running Alpine Linux

OpenWorld 2017: The Influential Engineer

Michael Carducci (@MichaelCarducci) gave a great session on how to increase your influence with others.  My notes are:

  • Exercise in Influence
    • Magic trick
  • Real Magic == Applied Psychology
  • 85% of success id human engineering and influence
    • people are irrational
  • Not enough to have a good case
  • 12 tips
    • You act first – Law of Reciprocity (if you do a favor then you get a favor)
      • Absolutely! I know if the situation were reversed you would do the same for me.
      • Of course, I know It’s important that this gets done and I’m happy to help I’m sure you would do the same if situation was reversed
      • Works in negotiations too
      • Always start with the large ask first – Counter request right away to make it received as a concession
    • Begin in a friendly way
    • Get the other person saying yes immediately
    • We are motivated by scarcity
      • What do your solution provide that you can’t get somewhere else
    • Dramatize your ideas – stories resonate in a way facts and figures can’t
    • Consensus
      • People more likely to say yes when they see other people saying yes
      • Listen to other ideas – don’t contradict “actually that turns out not to be the case”
    • Consistency – people live up to what they write down
    • Authority – if an expert says it, then it must be true (you are the expert)
      • Your Expertise comes from
        • Background
        • Experiences
        • Trustworthiness
      • Lead with a weakness in your case – shows knowledge and honesty
    • Be liked
      • similarities
      • compliments
      • cooperative efforts
    • Don’t ague and criticize
    • Try to honestly see things from the other person’s perspectives
    • Admit when your wrong
    • Be a good listener

OpenWorld 2017: MySQL Server 8.0

Morgan Tocker from Oracle gave a good overview of what is new in MySQL 8.0.  Notes:

  • 5.7 improvements
    • Group Replication
    • Performance Schema
    • SYS Schema
    • JSON Support
    • Performance
    • Partitions
    • Trigger Changes
    • MAIN changes
      • GIS support
      • JSON Types
      • Observational Management
  • Premium placed on Time to Market in modern applications
  • MysQL 8
    • GIS
      • Spatial Reference System (SRS) Support
      • Helper functions to manipulate and convert data
        • st_x
        • st_y
        • st_srid
    • JSON
      • a bunch of new functions
      • example showing how to query JSON data out of a JSON doc stored in a column
    • utf8mb4 now default (UNICODE 9)
      •  a lot already were using as default
      • emojis
      • new collations
        • Japanese and Russian
    • Performance
      • showed sysbench read only performance improvements
      • utfmb3 and utfmb4
        • 4 byte more performant so they are deprecating the 3 byte
    • UUID
      • ability to convert to bin and back
      • binary format reorders pieces of the UUID to allow them to follow insert order to avoid splits
    • Hot Row contention handling
      • NOWAIT – Error immediately
      • SKIP LOCKED – only operate on non locked rows (non deterministic)
    • Common Table Expressions
      • WITH t1 AS (SELECT * FROM tblA WHERE a=’b’) SELECT * FROM t1;
      • optimizer know if it needs to use temp table sor if it just needs to rewrite the query on its end
      • Recursive CTEs also supported
    • Analytics Queries with Windows Functions
      • PARTITION BY
    • Optimizer improvements
      • considers how much of an index is in memory vs. on disk
      • Range Scans will be used if all data is in memory
      • can see in info schema how much is in memory (estimated by InnoDB)
      • Histograms support
        • lower cost alternative to an index
        • created on demand and updated manually
        • no regular overhead until told to re-update
        • 2 types
          • equi-height
          • singleton
      • Descending indexes
      • Invisible Indexes (Hidden indexes)
        • Soft deletes
        • Staged Rollouts
        • Still maintained even if not visible
        • only a metadata change to switch visible vs invisible
      • Performance Schema
        • A lot more instrumentation
        • improved defaults so things are on
          • memory
          • transactions
          • metadata – all three on by default in MySQL 8
        • Better indexing on the performance schema – improved performance of querying this data
        • Now can see errors returned to clients from within performance schema
        • Added histograms in addition to min max average to show distribution better (write CTE to visualize)
          • per statement digest basis
        • Window into InnoDB locks – row locks
        • see where variables are sourced from (configuration variables)
      • Information Schema
        • a lot of tables are migrated to views on top of mysql tables to improve performance
        • 30 times faster
      • Consistent Configuration
        • json file that allows you to persist changes to configuration from commands in the instance
      • Crash safe data dictionary
      • New defaults
        • utf8md4
        • new collation
        • performance schema – memory, transactions, MDL on by default
        • InnoDB –
      • Some replication changes
      • Can “set persist_only innodb_dedicated_server = true” – resets settings as systems are sized up and down

OpenWorld 2017: The State of the Dolphin & Customer Experiences

Session on where MySQL is at and what is coming up.

  • MySQL is #1 database for the web
  • 2nd most popular behind Oracle (DB Engines is the source for this)
  • Most popular according to stack overflow
  • MySQL 5.7
    • 3x better performance
    • replication enhancements
    • json
    • improved security
  • MySQL InnoDB Cluster
    • mysql group replication
    • mysql router
    • mysql shell
  • MySQL 8.0
    • Data Dictionary
    • Roles
    • Unicode 9
    • CTEs
    • Window Functions
  • Cloud information about Oracles cloud
    • Freudian slip, as the speaker said cloud would make us more expensive (he meant responsive and quickly correct it)
    • Oracle cloud machine info
      • Has MySQL Cloud service on it
  • Facebook
  • Booking
  • Engineering Updates
    • MySQL 8.0
      • JSON Document Store
      • Improved GIS support
      • utf8mb4 is now default
      • New collation based on DUCET, Japanese and Russian added
      • Improved performance
      • MySQL Shell
        • Customizable prompt – includes context and session info
        • Custom font support
        • Persistent command line
        • DBA Admin API
          • used to set up InnoDB cluster
      • Showed how to use syntax to embed json inside SQL Statements
      • Common Table Expressions
      • Window Functions
      • InnoDB Cluster
        • Router – Routes client to a server online with the correct role
        • Group Replication – “virtually synchronous”
      • Better Handling of Hot Row Contention – NOWAIT and SKIP LOCKED
      • Invisible Indexes
        • Hidden to Optimizer
        • For soft delete of index (recycle bin) and staged rollout
      • Performance Schema
        • Histograms
        • Indexes
        • Data Lock Instrumentation
        • SQL Errors instrumentation
        • Variables instrumentation
        • Table plugin
        • Improved defaults
        • over 30x faster
      • Transactional Data Dictionary
        • Common data dictionary for db and inoodb
        • crash-safe & atomic DDL
        • meta-data locking for FK constraints
        • Scalable Information schema
      • Faster Table/Range Scans
      • Histograms
      • Parallel Replication
      • Improved Cost Model
      • Resource Groups
        • bind operations to cores – i.e updated to a # of cores
    • Most application today are mobile first

OpenWorld 2017: What’s New in MySQL NDB Cluster?

This presentation was given by Bernd Ocklin, Senior Director MySQL Cluster Development at Oracle.  It was a good overview of what NDB Cluster is, where it came from and some recent changes.  My notes are:

  • Trade-Offs: Cellular Network
  • Trade-Offs: Online Games – Massive Parallel
  • 99.999% uptime
  • Distributed real-time in-memory network database
  • Runs standalone or as a storage engine for MySQL
  • Auto-sharding based on key
    • transparent between node groups
    • each fragment active in one Data Node with synchronous replication to 2nd data node
  • DBA chooses what part of Primary key to shared on
  • parallelization works even on single queries
  • batches automatically
  • event-driven and asynchronous
  • Active-Active with Geo-Replication in NDB Cluster
    • Update anywhere
    • conflict detection
    • auto-conflict-resolution
  • Async-Replication is for read scale out
  • MySQL Group Replication – innoDB Cluster
    • scale out writes as well as reads (limited write scale out)
  • Moving towards mixing NDB cluster as a node in InnoDB cluster (not there yet)
  • Synchronous locally and use replication for geo-replication (asynchronous)
  • Updates scale pretty much linearly
  • version 7.6 new features
    • makes joins much faster
    • added ndb import to read in csv files
    • MEM integration
    • MySQL 5.7
    • dynamic resource allocation
    • Partial Local Checkpoint – for performance improvements and scalability (now support much larger databases)
    • Multi-treaded backup
  • in MySQL 8.0 they are moving NDB to base MySQL to avoid forking and placing it as a first class citizen
  • MySQL 8.0 new feature
    • CTEs
    • Window Functions
    • Data Dictionary
    • utf8mb4
    • Security Roles
    • Improved Information Schema Performance

OpenWorld 2017: MySQL Document Store

This was a great session that was centered on the updates to JSON support in MySQL 8.  But to paint the full picture of how MySQL 8 is making JSON a first class data citizen in the platform, it also covered some updates to InnoDB Cluster and the new MySQL Shell.  The session was present by Alfredo Kojima, Miguel Araujo, and Mike Zinner all from Oracle.  Mike is responsible for client tools – shell, workbench, router, connectors, etc.  Alfredo was there to talk about the changes in InnoDB Cluster, and Miguel for the Document Store itself.

These changes make it more possible for you to use MySQL as a NoSQL document store.  With MySQL however you can retain ACID compliance while doing most (if not all) that NoSQL data platforms like MongoDB provide.  It will be interesting to get my hands on this to see what tradeoffs they make to retain the ACID model.  Especially compared to other NoSQL data engines that take the BASE model.

My notes from the session where.

  • MySQL Document Store Intro
    • What is a document?
      • structured data
      • They are using JSON (=JavaScript Object Notation)
      • Others are XML, YAML, etc.
    • Relational Model vs. Document Model
      • Higher Upfront effort during design vs. lower effort upfront
      • Database ensures integrity vs. Application responsible for integrity
      • Schema updates can be problematic vs. Simple schema change
    • Now MySQL supports both relational and document models
    • Exposes new CRUD interface, while keeping SQL
    • Standard plugin for MySQL Server (since MysqL 5.7.12)
    • Store JSON documents in MySQL inside tables (JSON type column)
    • Store JSON outside of relational tables (collection not table)
      • Collections belong to a database/schema just like a table, so you can have collections of the same name in different schemas on the same instance
      • Collections are implemented as tables(with an id and a doc column of type json)
    • Key takeaways
      • INNODB
      • transactions and locking
      • ect…
  • Components of the MySQL Document Store
    • MySQL Shell – new command line interface
    • Works with InnoDB Cluster
    • X Protocol – new protocol to deal with documents (more efficient)
    • Connectors for
      • .NEt
      • Node.JS
      • Python
      • Java
      • etc..
    • Document Store is tightly connected to InnoDB Cluster
      • for scale out
  • InnoDB Cluster
    • basically turning MySQL into a scale out architecture (looks like MongoDB or other NoSQL system architectures to me)
  • The X DevAPI – A modern CRUD App Programming Interface
    • To create a collection – “var products = db.createCollection(‘products’);”
    • Add document – “products.add({“_id”: 123, “name”: “Test1″}).execute();”
    • retrieve/select – “products.find(“price > 10 and price < 100”).fields([“_id”, “name”]).sort([“name”]).execute();
    • change – “products.modify(“_id”: 123  <missed it>
    • delete – products.remove(” <missed it>
  • Node Js example – Async call to MySQL and function (doc) as loopback

var schema = session.getSchema(‘mySchema’);

var coll = schema.getCollection(‘myColl’);

var query = “$.name == :name”;

coll.find(query).bind(‘name’,  ‘Alfredo’).execute(function (doc) {

console.log(doc);

})

.catch(function (err) {

console.log(err.message);

console.log(err.stack);

});

  • Demo of shell interface
    • port 33060 for XDev API port
    • new shell defaults to java script mode
    • can switch to sql with \sql
    • \js to switch back to javascript mode
    • dba.  used to manage the instance with javascript
    • session.sql(“SELECT * FROM TEST1;”).execute(); – for sql in javascript shell
    • session.sql(“select ?”).bind(“‘Hello World'”).execute(); – by using bind it escapes everything for you
    • much more powerful interface even for DBAs
    • scripts require the .execute(), but not needed while using the shell
    • new shell documentation – https://dev.mysql.com/doc/x-devapi-userguide/en/
    • new uuid with timestamp at the end to be more sequential for performance (within a session)
  • Documents via SQL
  • Behind the Scenes