{"id":1549,"date":"2021-08-02T16:26:31","date_gmt":"2021-08-02T19:26:31","guid":{"rendered":"http:\/\/wagnerbianchi.com\/blog\/?p=1549"},"modified":"2021-08-02T16:26:31","modified_gmt":"2021-08-02T19:26:31","slug":"the-mariadb-storage-engine-independent-column-compression","status":"publish","type":"post","link":"http:\/\/wagnerbianchi.com\/blog\/?p=1549","title":{"rendered":"The MariaDB Storage-Engine Independent Column Compression"},"content":{"rendered":"<p>One of the features DBAs have on their sleeves is the compression of the data living on databases. For the MariaDB Server, this is not any different. Among some options, we see available for compressing data and save sometimes a bunch of space, one of them certainly is the\u00a0<a href=\"https:\/\/mariadb.com\/kb\/en\/storage-engine-independent-column-compression\/#comparison-with-innodb-page-compression\">Storage-Engine Independent Column Compression<\/a>, which makes it possible to compress data on the column level.<\/p>\n<p>The motivation to analyse this feature on MariaDB Community Server came by after reading <a href=\"https:\/\/jira.mariadb.org\/browse\/MDEV-22367\">MDEV-22367<\/a>,\u00a0which claims that MariaDB should retain tables created with the InnoDB having the <strong>ROW_FORMAT=COMPRESSED<\/strong>\u00a0as read-only by default, as mentioned on the notable changes for the <a href=\"https:\/\/mariadb.com\/kb\/en\/mariadb-1060-release-notes\/#innodb\">MariaDB 10.6<\/a> (InnoDB). So, the Colum Compression appears to be, at the initial moment, an alternative if you want to make the compression a little more granular instead of running your databases with the\u00a0<a href=\"https:\/\/mariadb.com\/kb\/en\/innodb-system-variables\/#innodb_read_only_compressed\">innodb_read_only_compressed<\/a> as OFF.<\/p>\n<p style=\"text-align: center;\"><strong>You must understand that compressing the whole table with the InnoDb ROW_FORMAT as COMPRESSED is different from compressing the columns of a table with the Storage-Engine Independent Column Compression.<\/strong><\/p>\n<p>So, the\u00a0Storage-Engine Independent Column Compression will help compress\u00a0columns of tables\u00a0of one of the following data types:\u00a0TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT, VARCHAR, and VARBINARY. You only need to worry about adding the COMPRESSED work to the columns, initially,\u00a0to get it compressed. I would also like to call the attention that you&#8217;re pretty much covered if you are using the <a href=\"https:\/\/mariadb.com\/kb\/en\/json-data-type\/\">JSON<\/a> data type added to the MariaDB Server at its version 10.2.7. The JSON data type is mapped out to the\u00a0<a href=\"https:\/\/mariadb.com\/kb\/en\/longtext\/\">LONGTEXT<\/a> data type mentioned previously.<\/p>\n<p>Before we start creating tables and adding compression, let&#8217;s see <a href=\"https:\/\/mariadb.com\/kb\/en\/storage-engine-independent-column-compression\/#new-system-variables\">system<\/a> and <a href=\"https:\/\/mariadb.com\/kb\/en\/storage-engine-independent-column-compression\/#new-status-variables\">status<\/a> variables available:<\/p>\n<pre>MariaDB [(none)]&gt; show global variables where variable_name in ('column_compression_threshold','column_compression_zlib_level','column_compression_zlib_strategy','column_compression_zlib_wrap');\r\n+----------------------------------+------------------+\r\n| Variable_name                    | Value            |\r\n+----------------------------------+------------------+\r\n| column_compression_threshold     | 100              |\r\n| column_compression_zlib_level    | 6                |\r\n| column_compression_zlib_strategy | DEFAULT_STRATEGY |\r\n| column_compression_zlib_wrap     | OFF              |\r\n+----------------------------------+------------------+\r\n4 rows in set (0.003 sec)\r\n\r\nMariaDB [(none)]&gt; show global status where variable_name in ('Column_compressions','Column_decompressions');\r\n+-----------------------+-------+\r\n| Variable_name         | Value |\r\n+-----------------------+-------+\r\n| Column_compressions   |\u00a0 0    |\r\n| Column_decompressions |\u00a0 0    |\r\n+-----------------------+-------+\r\n2 rows in set (0.001 sec)<\/pre>\n<p>OK, having said that, let&#8217;s create a simple table having three columns and\u00a0one of them that we are supposed to add compression soon issuing an\u00a0<strong>ALTER\u00a0TABLE<\/strong> command. All the exercises here will be done using the MariaDB Community Server\u00a010.6.3 on Ubuntu 20.04.2 LTS (mariadb:latest docker image), but the Columnar Compression sed here is supported since MariaDB Server 10.3.2.<\/p>\n<pre>CREATE TABLE `t1` (\r\n`a` int(11) NOT NULL AUTO_INCREMENT,\r\n`b` varchar(255) DEFAULT NULL,\r\n`c` blob DEFAULT NULL,\r\nPRIMARY KEY (`a`)\r\n) ENGINE=InnoDB AUTO_INCREMENT=84525 DEFAULT CHARSET=utf8mb4;<\/pre>\n<p>Let&#8217;s add some rows to the table:<\/p>\n<pre>root@e7bc0381525d:\/# for i in {01..120176}; do mariadb -e 'INSERT INTO test.t1 set a=null, b=REPEAT('b',255), c=REPEAT('c', 65535);'; done\r\nroot@e7bc0381525d:\/#<\/pre>\n<p>Let&#8217;s check the size of the table t1 tablespace:<\/p>\n<pre>root@e7bc0381525d:\/# mariadb -e 'select count(*) from test.t1'\r\n+----------+\r\n| count(*) |\r\n+----------+\r\n| 120176 |\r\n+----------+\r\nroot@e7bc0381525d:\/# ls -lh \/var\/lib\/mysql\/test\r\ntotal 12M\r\n-rw-rw---- 1 mysql mysql 67 Aug 2 15:32 db.opt\r\n-rw-rw---- 1 mysql mysql 2.0K Aug 2 17:39 t1.frm\r\n-rw-rw---- 1 mysql mysql 11M Aug 2 18:52 t1.ibd<\/pre>\n<p>Let&#8217;s add compression to columns b and c:<\/p>\n<pre>root@e7bc0381525d:\/# mariadb -e 'alter table test.t1 change b b varchar(255) compressed, change c c blob compressed;'\r\n\r\nroot@e7bc0381525d:\/# mariadb -e 'show create table test.t1'\r\nCREATE TABLE `t1` (\r\n `a` int(11) NOT NULL AUTO_INCREMENT,\r\n `b` varchar(255) \/*!100301 COMPRESSED*\/ DEFAULT NULL,\r\n `c` blob \/*!100301 COMPRESSED*\/ DEFAULT NULL,\r\n PRIMARY KEY (`a`)\r\n) ENGINE=InnoDB AUTO_INCREMENT=120620 DEFAULT CHARSET=utf8mb4;\r\nroot@e7bc0381525d:\/# ls -lh \/var\/lib\/mysql\/test\r\ntotal 4.2M\r\n-rw-rw---- 1 mysql mysql 67 Aug 2 15:32 db.opt\r\n-rw-rw---- 1 mysql mysql 2.0K Aug 2 19:00 t1.frm\r\n-rw-rw---- 1 mysql mysql 4.0M Aug 2 19:00 t1.ibd<\/pre>\n<p>Let&#8217;s check our status variables:<\/p>\n<pre>MariaDB [test]&gt; show global status where variable_name in ('Column_compressions','Column_decompressions');\r\n+-----------------------+-------+\r\n| Variable_name         | Value |\r\n+-----------------------+-------+\r\n| Column_compressions   | 22    |\r\n| Column_decompressions | 22    |\r\n+-----------------------+-------+\r\n2 rows in set (0.001 sec)<\/pre>\n<p>So, from 11MB to 4 MB, we&#8217;re talking about a compression rate of ~63%. The remaining question here is that this rate can vary considering the size of the tablespace? Maybe you can share your experience by adding your comment &#8211; any comments are really welcome.<\/p>\n<p><strong>Attention:<\/strong><\/p>\n<p>This blog post isn&#8217;t to encourage or discourage anything; this is meant to mainly exercise the column compression, which appears to be a good feature and must be more used to offer insights for improving it.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the features DBAs have on their sleeves is the compression of the data living on databases. For the MariaDB Server, this is not any different. Among some options, we see available for compressing data and save sometimes a bunch of space, one of them certainly is the\u00a0Storage-Engine Independent Column Compression, which makes it [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1549"}],"collection":[{"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1549"}],"version-history":[{"count":6,"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1549\/revisions"}],"predecessor-version":[{"id":1555,"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1549\/revisions\/1555"}],"wp:attachment":[{"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1549"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1549"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/wagnerbianchi.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1549"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}