OpenWorld 2017: InnoDB: What’s new in 8.0

Sunny Bains from Oracle

  • Legacy – Multiple Data Dictionaries (.frm & InnoDB DD)
    • changes not atomic
    • mismatch possible
    • concurrent access had to be managed
    • not crash proof
  • 8.0 is transitional so InnoDB will have mutex for this reason but will go away in future releases
  • New Data Dictionary
    • Required for transactional DDL (future)
    • stated in InnoDB
    • Control meta-data access using single locking mechanism
    • no .frm files for temporary tables – meta-data in memory only
    • Improves table spaces by removing .frm files
    • Stored in new InnoDB system tables
    • single set of persisted metadata for all storage engines
    • makes atomic DDL possible
  • InnoDB serves 2 roles
    • Data Dictionary store for all storage engines
    • implements atomic DDL (DDL_Log)
  • InnoDB gets metadata from the server not from system tables
  • DDL_Log
    • post commit is when physical deletes happen
    • during process the intent to delete is recorded in the DDL_Log
    • stored inside InnoDB
    • no direct user access
  • Performance
    • Cost Based Optimizer statistics
      • Number of pages in RAM per index
    • Remove the buffer pool mutex (Percona contribution)
      • Took a long time to fix problems in the contributed patch
      • QA team found lots of problems in edge cases
      • Foundation for more improvements in the future
    • CATS (Contention Aware Transaction Scheduling)
      • contributed by University of Michigan DB researchers
      • No configuration required
      • Switches between FIFO and CATS automatically
        • threshold is >-= 32 waiting threads
    • group records by table id when purging
      • reduces contention of dict_index_t::lock when multiple purge threads enabled
      • –innodb_deadlock_detect := bool (dynamic)
        • on high concurrent workloads deadlock detector becomes expensive so this turns it off and rely on rollback
      • –innodb_stats_include_delte_marked : bool
        • include/exclude rows marked as deleted
    • memcache improvments
      • support multiple get and range search
    • persistent auto increment
      • doesn’t rest to SELECT MAX(AUTOINC_COL) FROM T; on restart
      • probably the most requested feature since v3.x
      • Bug 199 – created 27 March 2003
    • new INFORMATION SCHEMA tabel INNODB_CACHED_INDEXES
      • pages cached in the InnoDB buffer pool cache
    • Ability to manage Undo table space
      • Add Drop undo table space
      • default of 2 undo table space required
      • undo truncate on by default
    • Avoid intermediate commits that would occur every 10000 rows
    • removed .sl files(InnoDB Symbolic Link files)
    • –innodb-read-only semantics change
      • if ON then affects entire MySQL instance
      • Because DD tables are stored in InnoDB
    • Better tablespace management
      • versioning of tablespaces
        • improves upgrade process
        • support multiple tablespace/page/row foramts
        • helps with maintaining backward compatibility
        • easier to introduce new features/capabilities
          • ie page/row formats that give better compression
    • Serialized Dictionary Information (SDI)
      • Tablespaces self describing
      • json file
      • improved import/export
      • tools for extracting
      • online offline
      • useful during disaster recovery
    • New In-Memory storage engine (temptable)
      • for internal use only
      • not shared across connections
      • lifetime limited to query life time
      • limited size, bound by ram used
        • –temptable-max-ram
    • More flexible BLOB handling
      • allow partial fetch and update
      • plan to make streaming easier
    • encrypt redo and undo logs
    • descending indexes
    • NOWAIT & SKIP_LOCK
    • Dedicated server
      • –innodb-dedicated-server (default OFF)
      • sets default value basedon physical memory available
      • dynamically sets the following variables (UNIX only)
        • –innodb-log-file-size
        • –innodb-buffer-pool-size
        • –innodb-flush-method
    • Scalable redo log
      • Dedicated redo log threads
    • Log buffer can be resized online
    • Scalable IO layer
  • Movement aware of static variables – everything should be dynamic

OpenWorld 2017: MySQL 8.0: What’s New in Security

Mike Frank and Georgi Kodinov from Oracle gave a deep dive into security changes in MySQL 8.0.  A lot of the features are only available in MySQL Enterprise and I tired to keep it clear in my notes which required Enterprise but may have gotten some confused.

  • Security Challenges
    • 89% of organizations experience data breaches, according to new Ponemon Survey
    • Complexity grows risks grow
      • Database consolidation
      • HA
    • Regulatory compliance
      • PCI
      • HIPAA
      • GDPR
      • FERPA
      • Data Protection Act (UK)
      • ….
  • How to Secure
    • Assess
      • Locate Risk and Vulnerabilities
    • Prevent
      • Using Cryptography, User Controls, Access Controls, etc
    • Detect
      • Audit, Monitor, Alert
    • Recover
      • Ensure service is not disrupted as a result of security incident
      • Even through the outage of a primary database
      • Forensics – post mortem – fix vulnerability
  • MySQL Security Solutions
    • Enterprise Edition
      • TDE
      • MySQL Enterprise Authentication
      • MySQL Enterprise Encryption
      • MysQL Enterprise Firewall
      • MySQL Enterprise Audit
      • MysQL Enterprise Monitor
      • MysQL Enterprise Backup
      • MysQL Enterprise Thread pool
    • TDE
      • key ring
      • only tablespace level in 5.7
      • Enterprise Edition uses KMIP for key management
    • Keyring API
      • each key has a name/ACL
      • stored in key store
      • a uniform infrastructure for handling keys
      • usable by both server and plugins
      • Fully extensible
      • minimum effort to add new backends and consumers
      • current consumers
        • InnoDB tablespace encryption
        • SQL user defined functions (UDF) plugin
      • current backends
        • flat file backend
        • KMIP compliant clients
  • New Features
    • MySQL Roles
      • multiple default roles
      • Can export the role graph in GraphML
      • implemented as user account with login disabled
      • new tables mysql.role_edges and mysql.default_roles
    • Atomic ACL Statements
    • Dynamic Privileges
      • provides finer grained administrative level access controls
    • Password Features
      • Password History
      • SHA2 with Caching
    • File Encryption
      • innodb_undo_log_encrypt
      • innodb_redo_log_encrypt

OpenWorld 2017: Split Your Database: Seamless Database Growth with MySQL

Nicolai Plum from Booking.com gave a presentation on how to split a database into multiple databases while minimizing/managing the impact to applications using it.  There is no voodoo magic, but it was a great session covering how to work with the application developers to split a database.

  • Stated with sales DB
    • Customers
    • Transactions
    • Delivery
    • Catalog
    • Inventory
    • Reviews
  • Breaking out the Product information
    • Catalog
    • Inventory
    • Reviews
  • Good idea to change schema name as you go to avoid confusion
  • Why
    • size
    • query time
    • regulatory requirements
    • business/architecture reasons
  • “GDPR best practice is to split private information away from non private information”
  • How to choose tables
    • minimize joins
    • Maximise gains
      • split out bigger, fastest growing, business, most regulated
      • size and activity use: ps-top (utility to help figure this out)
  • Code changes needed
    • may be able to use ProxySQL (has not tried it)
    • try to separate read and write traffic
  • “Everyone has a test environment, some people have a production one as well”
  • Use binlog to help
  • Get rid of joins in updates
  • joins move from sql to app code
  • foreign keys across servers will not work – relational integrity checks move up to app layer
  • uses a translation master to move data between the two schema
    • uses replicate-rewrite-db
    • uses replicate-wild-do-table
  • Steps to create translation master
    • Make a slave, with binlog enabled
    • shut down MySQL
    • configuration: Add
      • skip-slave=start
      • replicate-rewrite-db=’sales=>product’
      • replicate-wild-do-table…..
    • Create new schema
    • Rename all tables that are being split to the new schema
    • Drop old schema from translation master
    • start replication slave
    • Remove skip-slave-start
    • If replication stops with missing table errors, find the bad query, fix up (skip) or make a new translation master
  • Add product schema on sales master to allow product to read and write to the same schema name
    • use sys;
    • call create_synonym_db(‘sales’, ‘product’);
  • Need views in slaves also
  • Now ready to remove translation master and cut over to the new product replication topology
  • Need to fix up grants – they don’t replicate through translation master
    • use different user names to avoid collisions/confusion
  • Now cut the name over – either push config or use DNS CNAME or A record  update
  • Tradeoffs with cut over
    • Disconnect first to have cleanest cut over but with a little downtime
    • Disconnect last to keep application up but with risk of collisions

OpenWord 2017: MySQL 8.0: What’s New in the Optimizer

Session was presented by Manyi Lu and Oystein Grovlen from Oracle and provided a little more depth to topics covered in previous presentations I attended.

  • CTE
    • can precede SELECT/UPDATE/DELETE including sub-queries
    • Can be used to refer to other CTEs (not able to do that with direct tables)
    • Recursive CTEs
      • set cte_max_recursion_depth to limit runaway recursion (default 100)
  • Window Functions
    • OVER (PARTITION BY ……)
  • UTF-8 Support (utf8mb4)
    • utf8mb4_0900_ai_ci default collation
    • performance gains
    • recommend utf8mb4_unicode_520_ci if prior to MySQL 8.0
    • careful with migrated tables can lead to
      • “illegal mix of collation” issues
      • slower queries as index will not be used
  • GIS
    • Geography Support
      • longitude/latitude
    • Spatial Reference Systems
  • JSON
    • Aggregation functions – combine JSON documents in multiple rows into a JSON array
    • Convert JSON documents to relational tables – allows access to sql aggregations and other methods
    • Utility Functions
      • JSON_PRETTY pretty print
      • JSON_STORAGE_SIZE – shows bytes used to store doc
      • JSON_STORAGE_FREE – shows bytes not used
  • Invisible Index
  • Descending indexes
  • Cost Model improvements
    • adapt to SSSD, larger memories, caches
    • cost model takes into account where data is stored
  • Histograms
  • Join Order Hints

OpenWorld 2017: Using MysQL Containers

Matt Lord (Senior MySQL Product Manager at Oracle) gave a good presentation on the state of MySQL with containers.  I had already been to a session on Oracle and containers and there was some overlap, but it was a good presentation on the state of Oracles support of MySQL on Docker.

My sense after attending both of these sessions is that Docker and contains have a large roll in the future delivery of database services.  I also have the sense that we are about 6 months to a year away from those on the bleeding edge having worked out the kinks and it will all be ready for production database workloads.

Notes:

  • What is a Container (already covered in Oracle Docker presentation)
    • Copy on Write layering file system
  • Native Windows Containers – Windows 10 and Windows Server 2016 (similar features to Linux kernel)
  • Makes most sense to match base OS with Container OS, but not required
    • Because less differences so less moved up in filesystem layers
  • Mac and Windows both support running Linux containers but run on top of visualizer (xhyve in mac, hyperv on Windows) – Alpine Linux (built for Docker)
  • Why
    • Package software in a single binary image
    • Everything is repeatable
    • Helps with patching
  • Official MySQL Container Goals
    • Official release product includes a container built on Oracle Linux
    • Container for all products
      • MySQL (NDB) Cluster – preliminary now
      • InnoDB Cluster – not public now
      • Router, Shell, Workbench, Utilities, etc. – future
    • Work with compose, Swarm, Kubernetes
    • Integrate with wider Oracle plans for Docker and Container Services
  • Demo of Docker on Mac
    • Showed vm running Alpine Linux

OpenWorld 2017: MySQL Server 8.0

Morgan Tocker from Oracle gave a good overview of what is new in MySQL 8.0.  Notes:

  • 5.7 improvements
    • Group Replication
    • Performance Schema
    • SYS Schema
    • JSON Support
    • Performance
    • Partitions
    • Trigger Changes
    • MAIN changes
      • GIS support
      • JSON Types
      • Observational Management
  • Premium placed on Time to Market in modern applications
  • MysQL 8
    • GIS
      • Spatial Reference System (SRS) Support
      • Helper functions to manipulate and convert data
        • st_x
        • st_y
        • st_srid
    • JSON
      • a bunch of new functions
      • example showing how to query JSON data out of a JSON doc stored in a column
    • utf8mb4 now default (UNICODE 9)
      •  a lot already were using as default
      • emojis
      • new collations
        • Japanese and Russian
    • Performance
      • showed sysbench read only performance improvements
      • utfmb3 and utfmb4
        • 4 byte more performant so they are deprecating the 3 byte
    • UUID
      • ability to convert to bin and back
      • binary format reorders pieces of the UUID to allow them to follow insert order to avoid splits
    • Hot Row contention handling
      • NOWAIT – Error immediately
      • SKIP LOCKED – only operate on non locked rows (non deterministic)
    • Common Table Expressions
      • WITH t1 AS (SELECT * FROM tblA WHERE a=’b’) SELECT * FROM t1;
      • optimizer know if it needs to use temp table sor if it just needs to rewrite the query on its end
      • Recursive CTEs also supported
    • Analytics Queries with Windows Functions
      • PARTITION BY
    • Optimizer improvements
      • considers how much of an index is in memory vs. on disk
      • Range Scans will be used if all data is in memory
      • can see in info schema how much is in memory (estimated by InnoDB)
      • Histograms support
        • lower cost alternative to an index
        • created on demand and updated manually
        • no regular overhead until told to re-update
        • 2 types
          • equi-height
          • singleton
      • Descending indexes
      • Invisible Indexes (Hidden indexes)
        • Soft deletes
        • Staged Rollouts
        • Still maintained even if not visible
        • only a metadata change to switch visible vs invisible
      • Performance Schema
        • A lot more instrumentation
        • improved defaults so things are on
          • memory
          • transactions
          • metadata – all three on by default in MySQL 8
        • Better indexing on the performance schema – improved performance of querying this data
        • Now can see errors returned to clients from within performance schema
        • Added histograms in addition to min max average to show distribution better (write CTE to visualize)
          • per statement digest basis
        • Window into InnoDB locks – row locks
        • see where variables are sourced from (configuration variables)
      • Information Schema
        • a lot of tables are migrated to views on top of mysql tables to improve performance
        • 30 times faster
      • Consistent Configuration
        • json file that allows you to persist changes to configuration from commands in the instance
      • Crash safe data dictionary
      • New defaults
        • utf8md4
        • new collation
        • performance schema – memory, transactions, MDL on by default
        • InnoDB –
      • Some replication changes
      • Can “set persist_only innodb_dedicated_server = true” – resets settings as systems are sized up and down

OpenWorld 2017: The State of the Dolphin & Customer Experiences

Session on where MySQL is at and what is coming up.

  • MySQL is #1 database for the web
  • 2nd most popular behind Oracle (DB Engines is the source for this)
  • Most popular according to stack overflow
  • MySQL 5.7
    • 3x better performance
    • replication enhancements
    • json
    • improved security
  • MySQL InnoDB Cluster
    • mysql group replication
    • mysql router
    • mysql shell
  • MySQL 8.0
    • Data Dictionary
    • Roles
    • Unicode 9
    • CTEs
    • Window Functions
  • Cloud information about Oracles cloud
    • Freudian slip, as the speaker said cloud would make us more expensive (he meant responsive and quickly correct it)
    • Oracle cloud machine info
      • Has MySQL Cloud service on it
  • Facebook
  • Booking
  • Engineering Updates
    • MySQL 8.0
      • JSON Document Store
      • Improved GIS support
      • utf8mb4 is now default
      • New collation based on DUCET, Japanese and Russian added
      • Improved performance
      • MySQL Shell
        • Customizable prompt – includes context and session info
        • Custom font support
        • Persistent command line
        • DBA Admin API
          • used to set up InnoDB cluster
      • Showed how to use syntax to embed json inside SQL Statements
      • Common Table Expressions
      • Window Functions
      • InnoDB Cluster
        • Router – Routes client to a server online with the correct role
        • Group Replication – “virtually synchronous”
      • Better Handling of Hot Row Contention – NOWAIT and SKIP LOCKED
      • Invisible Indexes
        • Hidden to Optimizer
        • For soft delete of index (recycle bin) and staged rollout
      • Performance Schema
        • Histograms
        • Indexes
        • Data Lock Instrumentation
        • SQL Errors instrumentation
        • Variables instrumentation
        • Table plugin
        • Improved defaults
        • over 30x faster
      • Transactional Data Dictionary
        • Common data dictionary for db and inoodb
        • crash-safe & atomic DDL
        • meta-data locking for FK constraints
        • Scalable Information schema
      • Faster Table/Range Scans
      • Histograms
      • Parallel Replication
      • Improved Cost Model
      • Resource Groups
        • bind operations to cores – i.e updated to a # of cores
    • Most application today are mobile first