MySQL insert became fast after adding an index

It actually happened at work: I had a table containing almost 2 million records.
I had created an importer which reads CSVs from Google Drive and saves the values from these files as records in the table, which are then displayed in a chart on the website.
The importer runs for an hour every two hours (via a cronjob). A month ago it still had 33k files on Google Drive to go (and every 10 minutes, 3 new files are added), while I had a month of vacation ahead of me :slight_smile: so I went to check up on the website yesterday.

What I saw: the charts were incredibly slow (when the page didn't throw an exception for allocating too many bytes), and the importer still had 27k files to go.
When running the importer manually, I found out that it took 2 minutes :clue: to save 1 file in the database.
So two problems crossed my path: slow charts (front end, reading/SELECT) and slow importing (back end, writing/INSERT).
I prioritized the charts, since that's what our client sees. So yesterday I made an SQL dump of the database on the server for my laptop, updated a lot of queries for the charts (the client requested some quite complex formulas), and added an index on one column. The charts have a much more acceptable loading time now.

So today I was going to do some refactoring since I had updated all those queries (before updating them I did a lot of hacking in PHP, long story), and since I am on my PC instead of my laptop today, I went to make an SQL dump again. Shockingly, I saw two files on the server:

  • dump_18-8.sql 134 MB
  • dump_19-8.sql 219 MB

I immediately ran the importer manually and yes, fewer than 15k files were left (the importer removes them after they have been read), and it now takes ~2 sec to save one file to the database… :o

How is it possible that an INSERT statement takes considerably less time (2 min down to 2 sec) after adding an index to the table? Apparently the opposite isn't always the case?

Possibly because of constraints (uniqueness of the primary key) the database has to check on each insert.
It can do these checks much faster with an index.
Actually, if you do batch inserts, you should disable constraint checking (if you are certain that your data will not violate the constraints) and enable it again afterwards. Don't know if you do this right now.
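In MySQL that's done with session flags (`SET unique_checks = 0;` and `SET foreign_key_checks = 0;` before the load, set them back to 1 afterwards). As a small runnable stand-in, here's a sketch with Python's stdlib SQLite driver, which can at least toggle foreign-key enforcement per connection — table names are made up, and SQLite can't defer uniqueness checks, so treat this as an illustration of the pattern rather than of MySQL behaviour:

```python
import sqlite3

# Autocommit mode, so the PRAGMAs below take effect immediately.
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("PRAGMA foreign_keys = ON")   # SQLite checks FKs only when asked
con.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
con.execute("CREATE TABLE child (id INTEGER PRIMARY KEY,"
            " parent_id INTEGER REFERENCES parent(id))")

# With checking on, an orphaned row is rejected:
try:
    con.execute("INSERT INTO child (parent_id) VALUES (999)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)

# Disable checking for the bulk load (MySQL: SET foreign_key_checks = 0),
# insert the pre-validated batch, then switch checking back on. Beware:
# rows inserted in the meantime are never retroactively validated.
con.execute("PRAGMA foreign_keys = OFF")
con.execute("INSERT INTO child (parent_id) VALUES (999)")  # accepted unchecked
con.execute("PRAGMA foreign_keys = ON")
print(con.execute("SELECT COUNT(*) FROM child").fetchone()[0])  # 1
```

The flags are per-session, so only the importing connection skips the checks; everyone else still gets them.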

I would NEVER EVER turn off constraint checking. If you don't need any consistency in your database, then why have constraints in the first place?

Constraint checking is a good part of what a database does.

[quote]In most cases, ALTER TABLE makes a temporary copy of the original table. MySQL waits for other operations that are modifying the table, then proceeds. It incorporates the alteration into the copy, deletes the original table, and renames the new one.
[/quote]
from mysql.com

Maybe a fragmented table got rebuilt into a new one.

Also: InnoDB? MyISAM? Using transactions? Which isolation level? Doing bulk inserts or prepared statements?

There is a far stretch between disabling constraints for batch processing and “not needing any consistency.”
Of course, for data manipulations that originate from user interactions (editing of fields) you would leave constraints enabled, since there you can reasonably expect incorrect data.
But it makes perfect sense to disable them for batch processing of automatically recorded data.
I do agree with you that constraint checking is a good part of a database, but only when you need it.

[quote]from mysql.com

Maybe a fragmented table got rebuilt into a new one.

Also: InnoDB? MyISAM? Using transactions? Which isolation level? Doing bulk inserts or prepared statements?
[/quote]
How does a table become fragmented? I have only read/inserted records in it.
The table uses InnoDB; I don't use prepared statements, transactions, or bulk inserts, my bad.
It was meant to insert just 3 records every ten minutes, but since I am a little slow, the files have been stacking up since May :-* And I have to admit, I didn't even think of using prepared statements.
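For what it's worth, the usual fix is one prepared statement reused for every row, wrapped in a single transaction. A minimal sketch with Python's stdlib SQLite driver (in Yii2/MySQL you'd use PDO prepared statements inside a DB transaction instead; the column names here are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE field_value (id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " value REAL, updated_at TEXT)")

rows = [(1.5, "2015-08-19 10:00:00"),
        (2.5, "2015-08-19 10:10:00"),
        (3.5, "2015-08-19 10:20:00")]

# One prepared statement reused for every row, one transaction, one commit --
# instead of a statement parse and a commit per INSERT.
con.execute("BEGIN")
con.executemany("INSERT INTO field_value (value, updated_at) VALUES (?, ?)",
                rows)
con.execute("COMMIT")

print(con.execute("SELECT COUNT(*) FROM field_value").fetchone()[0])  # 3
```

Committing once per batch instead of once per row is usually the biggest win, since each commit forces the database to flush to disk.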

I didn’t use any constraints in the table though, except this auto_incremented Primary Key:
+-------------+--------------+------+-----+---------------------+-----------------------------+
| Field       | Type         | Null | Key | Default             | Extra                       |
+-------------+--------------+------+-----+---------------------+-----------------------------+
| id          | int(11)      | NO   | PRI | NULL                | auto_increment              |
+-------------+--------------+------+-----+---------------------+-----------------------------+

Thanks for all of your fast replies. :slight_smile:

Maybe the non-indexed SELECT queries were swamping the server so much that the INSERT queries barely came through…? The SELECT queries would have had to be hammering on the database continuously though… that might be rather unlikely.

Actually, now that you say it, Riven, I remember that I still have a check in the importer so the same records don't get saved twice. I put it in mainly because I only implemented the deleting of files later:

$field = FieldValue::find()->where(['updated_at' => $dateOfFile])->one(); //slightly modified

Which is code in PHP Framework Yii2 for the following:

SELECT * FROM field_value WHERE updated_at = :date LIMIT 1

So this topic was actually a big lie… probably… I am sorry guys :persecutioncomplex:

EDIT: Well, now that I think about it, could this little query really cost 2 minutes?? That's impossible, because then my charts would never have loaded in the first place.

And no, there were almost no other SELECT queries on the server; the client had access to it, but I'd be surprised if they even logged in once a day.

There isn't enough information for us to answer the questions, really. All I can say is that full table scans are slow.

If you want to get to the bottom of this, just drop the index and profile your code.

Could not disagree more. You think the batch job has everything handled correctly, but then the database is in an inconsistent state afterwards and you're boned. Just don't do it. Real databases handle constraint checking so fast it should never need to be off unless you're doing something wrong, like non-atomic commits. Which you just should never do. Ever.

I have seen companies I have contracted for more or less go under because of sloppy database practices. Turning off constraints is just sloppy.