gogoWebsite

How to index JSON fields in MySQL

Updated to 36 minutes ago

Reprint address:/articles/303208?utm_content=m_37669

 

Overview

MySQL supports JSON fields since 5.7.8, which greatly enriches the data types of MySQL. It also facilitates developers. But MySQL does not provide the function of indexing fields in JSON objects, at least there is no way to directly index its fields. This article will introduce the use of virtual fields in MySQL 5.7 to index fields in JSON objects.

Sample data

We will demonstrate based on the following JSON object

{
    "id": 1,  
    "name": "Sally",  
    "games_played":{    
       "Battlefield": {
          "weapon": "sniper rifle",
          "rank": "Sergeant V",
          "level": 20
        },                                                                                                                          
       "Crazy Tennis": {
          "won": 4,
          "lost": 1
        },  
       "Puzzler": {
          "time": 7
        }
     }
 }

The basic structure of the table


CREATE TABLE `players` (  
    `id` INT UNSIGNED NOT NULL,
    `player_and_games` JSON NOT NULL,
    PRIMARY KEY (`id`)
);

If we only use the structure of the table above, we cannot index the Key in the JSON field. Next we demonstrate how to index virtual fields

Add virtual fields

The virtual column syntax is as follows

<type> [ GENERATED ALWAYS ] AS ( <expression> ) [ VIRTUAL|STORED ]
[ UNIQUE [KEY] ] [ [PRIMARY] KEY ] [ NOT NULL ] [ COMMENT <text> ]

In MySQL 5.7, two Generated Columns are supported, namely Virtual Generated Column and Stored Generated Column. The former only saves Generated Column in the data dictionary (the metadata of the table) and does not persist this column of data to disk; the latter persists Generated Column to disk, rather than calculates each time it is read. It is obvious that the latter stores data that can be calculated from existing data and requires more disk space. It has no advantage over Virtual Column. Therefore, in MySQL 5.7, the type of Generated Column is not specified, and the default is Virtual Column.

If you need Stored Generated Golumn, it may be more appropriate to create an index on Virtual Generated Column. Generally, Virtual Generated Column is used, which is also the default method of MySQL

The table creation statement after adding virtual columns is as follows:

CREATE TABLE `players` (  
   `id` INT UNSIGNED NOT NULL,
   `player_and_games` JSON NOT NULL,
   `names_virtual` VARCHAR(20) GENERATED ALWAYS AS (`player_and_games` ->> '$.name') NOT NULL, 
   PRIMARY KEY (`id`)
);

Note: Use operator-» to refer to the KEY in the JSON field. In this example, the field names_virtual is a virtual field, and I define it as not null. In actual work, we must determine the specific situation. Because JSON itself is a weak structured data object. In other words, its structure is not fixed.

We insert data

INSERT INTO `players` (`id`, `player_and_games`) VALUES (1, '{  
    "id": 1,  
    "name": "Sally",
    "games_played":{    
       "Battlefield": {
          "weapon": "sniper rifle",
          "rank": "Sergeant V",
          "level": 20
        },                                                                                                                          
       "Crazy Tennis": {
          "won": 4,
          "lost": 1
        },  
       "Puzzler": {
          "time": 7
        }
      }
   }'
);
...

View the data in the table

SELECT * FROM `players`;

+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+
| id | player_and_games                                                                                                                                                                                           | names_virtual |
+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+
|  1 | {"id": 1, "name": "Sally", "games_played": {"Puzzler": {"time": 7}, "Battlefield": {"rank": "Sergeant V", "level": 20, "weapon": "sniper rifle"}, "Crazy Tennis": {"won": 4, "lost": 1}}}                  | Sally         |
|  2 | {"id": 2, "name": "Thom", "games_played": {"Puzzler": {"time": 25}, "Battlefield": {"rank": "Major General VIII", "level": 127, "weapon": "carbine"}, "Crazy Tennis": {"won": 10, "lost": 30}}}            | Thom          |
|  3 | {"id": 3, "name": "Ali", "games_played": {"Puzzler": {"time": 12}, "Battlefield": {"rank": "First Sergeant II", "level": 37, "weapon": "machine gun"}, "Crazy Tennis": {"won": 30, "lost": 21}}}           | Ali           |
|  4 | {"id": 4, "name": "Alfred", "games_played": {"Puzzler": {"time": 10}, "Battlefield": {"rank": "Chief Warrant Officer Five III", "level": 73, "weapon": "pistol"}, "Crazy Tennis": {"won": 47, "lost": 2}}} | Alfred        |
|  5 | {"id": 5, "name": "Phil", "games_played": {"Puzzler": {"time": 7}, "Battlefield": {"rank": "Lt. Colonel III", "level": 98, "weapon": "assault rifle"}, "Crazy Tennis": {"won": 130, "lost": 75}}}          | Phil          |
|  6 | {"id": 6, "name": "Henry", "games_played": {"Puzzler": {"time": 17}, "Battlefield": {"rank": "Captain II", "level": 87, "weapon": "assault rifle"}, "Crazy Tennis": {"won": 68, "lost": 149}}}             | Henry         |
+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+

View the fields of the table Players

SHOW COLUMNS FROM `players`;

+------------------+------------------+------+-----+---------+-------------------+
| Field            | Type             | Null | Key | Default | Extra             |
+------------------+------------------+------+-----+---------+-------------------+
| id               | int(10) unsigned | NO   | PRI | NULL    |                   |
| player_and_games | json             | NO   |     | NULL    |                   |
| names_virtual    | varchar(20)      | NO   |     | NULL    | VIRTUAL GENERATED |
+------------------+------------------+------+-----+---------+-------------------+

We see that the type of virtual field names_virtual is VIRTUAL GENERATED. MySQL just saves the metadata of the field in the data dictionary and does not really store the value of the field. This way the table size does not increase. We can use the index to physically store the values ​​on this field.

Add index on virtual fields

Before adding the index, let's look at the execution plan of the query below

EXPLAIN SELECT * FROM `players` WHERE `names_virtual` = "Sally"\G  
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: players
   partitions: NULL
         type: ALL
possible_keys: NULL  
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 6
     filtered: 16.67
        Extra: Using where

Add an index

CREATE INDEX `names_idx` ON `players`(`names_virtual`);  

Then execute the above query statement and we will get a different execution plan

EXPLAIN SELECT * FROM `players` WHERE `names_virtual` = "Sally"\G  
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: players
   partitions: NULL
         type: ref
possible_keys: names_idx  
          key: names_idx
      key_len: 22
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL

As we can see, the latest execution plan goes through the newly created index.

 

summary

This article describes how to save JSON documents in MySQL 5.7. In order to efficiently retrieve content in JSON, we can use the virtual field of 5.7 to index different KEYs of JSON. Greatly improve the speed of search.