Reprint address:/articles/303208?utm_content=m_37669
Overview
MySQL supports JSON fields since 5.7.8, which greatly enriches the data types of MySQL. It also facilitates developers. But MySQL does not provide the function of indexing fields in JSON objects, at least there is no way to directly index its fields. This article will introduce the use of virtual fields in MySQL 5.7 to index fields in JSON objects.
Sample data
We will demonstrate based on the following JSON object
{
"id": 1,
"name": "Sally",
"games_played":{
"Battlefield": {
"weapon": "sniper rifle",
"rank": "Sergeant V",
"level": 20
},
"Crazy Tennis": {
"won": 4,
"lost": 1
},
"Puzzler": {
"time": 7
}
}
}
The basic structure of the table
CREATE TABLE `players` (
`id` INT UNSIGNED NOT NULL,
`player_and_games` JSON NOT NULL,
PRIMARY KEY (`id`)
);
If we only use the structure of the table above, we cannot index the Key in the JSON field. Next we demonstrate how to index virtual fields
Add virtual fields
The virtual column syntax is as follows
<type> [ GENERATED ALWAYS ] AS ( <expression> ) [ VIRTUAL|STORED ]
[ UNIQUE [KEY] ] [ [PRIMARY] KEY ] [ NOT NULL ] [ COMMENT <text> ]
In MySQL 5.7, two Generated Columns are supported, namely Virtual Generated Column and Stored Generated Column. The former only saves Generated Column in the data dictionary (the metadata of the table) and does not persist this column of data to disk; the latter persists Generated Column to disk, rather than calculates each time it is read. It is obvious that the latter stores data that can be calculated from existing data and requires more disk space. It has no advantage over Virtual Column. Therefore, in MySQL 5.7, the type of Generated Column is not specified, and the default is Virtual Column.
If you need Stored Generated Golumn, it may be more appropriate to create an index on Virtual Generated Column. Generally, Virtual Generated Column is used, which is also the default method of MySQL
The table creation statement after adding virtual columns is as follows:
CREATE TABLE `players` (
`id` INT UNSIGNED NOT NULL,
`player_and_games` JSON NOT NULL,
`names_virtual` VARCHAR(20) GENERATED ALWAYS AS (`player_and_games` ->> '$.name') NOT NULL,
PRIMARY KEY (`id`)
);
Note: Use operator-» to refer to the KEY in the JSON field. In this example, the field names_virtual is a virtual field, and I define it as not null. In actual work, we must determine the specific situation. Because JSON itself is a weak structured data object. In other words, its structure is not fixed.
We insert data
INSERT INTO `players` (`id`, `player_and_games`) VALUES (1, '{
"id": 1,
"name": "Sally",
"games_played":{
"Battlefield": {
"weapon": "sniper rifle",
"rank": "Sergeant V",
"level": 20
},
"Crazy Tennis": {
"won": 4,
"lost": 1
},
"Puzzler": {
"time": 7
}
}
}'
);
...
View the data in the table
SELECT * FROM `players`;
+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+
| id | player_and_games | names_virtual |
+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+
| 1 | {"id": 1, "name": "Sally", "games_played": {"Puzzler": {"time": 7}, "Battlefield": {"rank": "Sergeant V", "level": 20, "weapon": "sniper rifle"}, "Crazy Tennis": {"won": 4, "lost": 1}}} | Sally |
| 2 | {"id": 2, "name": "Thom", "games_played": {"Puzzler": {"time": 25}, "Battlefield": {"rank": "Major General VIII", "level": 127, "weapon": "carbine"}, "Crazy Tennis": {"won": 10, "lost": 30}}} | Thom |
| 3 | {"id": 3, "name": "Ali", "games_played": {"Puzzler": {"time": 12}, "Battlefield": {"rank": "First Sergeant II", "level": 37, "weapon": "machine gun"}, "Crazy Tennis": {"won": 30, "lost": 21}}} | Ali |
| 4 | {"id": 4, "name": "Alfred", "games_played": {"Puzzler": {"time": 10}, "Battlefield": {"rank": "Chief Warrant Officer Five III", "level": 73, "weapon": "pistol"}, "Crazy Tennis": {"won": 47, "lost": 2}}} | Alfred |
| 5 | {"id": 5, "name": "Phil", "games_played": {"Puzzler": {"time": 7}, "Battlefield": {"rank": "Lt. Colonel III", "level": 98, "weapon": "assault rifle"}, "Crazy Tennis": {"won": 130, "lost": 75}}} | Phil |
| 6 | {"id": 6, "name": "Henry", "games_played": {"Puzzler": {"time": 17}, "Battlefield": {"rank": "Captain II", "level": 87, "weapon": "assault rifle"}, "Crazy Tennis": {"won": 68, "lost": 149}}} | Henry |
+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+
View the fields of the table Players
SHOW COLUMNS FROM `players`;
+------------------+------------------+------+-----+---------+-------------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+------------------+------+-----+---------+-------------------+
| id | int(10) unsigned | NO | PRI | NULL | |
| player_and_games | json | NO | | NULL | |
| names_virtual | varchar(20) | NO | | NULL | VIRTUAL GENERATED |
+------------------+------------------+------+-----+---------+-------------------+
We see that the type of virtual field names_virtual is VIRTUAL GENERATED. MySQL just saves the metadata of the field in the data dictionary and does not really store the value of the field. This way the table size does not increase. We can use the index to physically store the values on this field.
Add index on virtual fields
Before adding the index, let's look at the execution plan of the query below
EXPLAIN SELECT * FROM `players` WHERE `names_virtual` = "Sally"\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: players
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 6
filtered: 16.67
Extra: Using where
Add an index
CREATE INDEX `names_idx` ON `players`(`names_virtual`);
Then execute the above query statement and we will get a different execution plan
EXPLAIN SELECT * FROM `players` WHERE `names_virtual` = "Sally"\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: players
partitions: NULL
type: ref
possible_keys: names_idx
key: names_idx
key_len: 22
ref: const
rows: 1
filtered: 100.00
Extra: NULL
As we can see, the latest execution plan goes through the newly created index.
summary
This article describes how to save JSON documents in MySQL 5.7. In order to efficiently retrieve content in JSON, we can use the virtual field of 5.7 to index different KEYs of JSON. Greatly improve the speed of search.