HowTo – Working With Large WordPress Databases

Recently I started a project that I knew would be a rather large one but about half way through it I found that the site would be unmanageable because of the resources that it placed on my server.

The Site was a Church Directory that included over 200,000 churches located throughout the USA. It was not a specific religion or denomination which would have been easier since fewer churches would have been listed but it was an overall listing of all religious “churches” or places of worship.

To provide the site I needed to convert a xml database to csv then use the csv database to build wordpress posts. This is a relatively complex process because I needed to create fields in each row that were based on but not on simple merges of some of the other fields but were a mixture of merges and new information. The resulting fields would be used for building categories and tags, storing addresses and names and other data. When information was not available then placeholder information had to be inserted for later update.

After building the csv database it took a few attempts to import it. Because of its size I ended up splitting the csv into about 20 smaller files using CSVSplitter.exe which you can find online for free.

The import process is time consuming but pretty straight forward. Using a WordPress plugin I would select a file then import it. Each of the 20 files took about 15 minutes and because of the heavy load on mySql and Apache it really eats up all of your computer’s processes.

To import such a large file you will need to adjust your php and apache settings to allow for a large file upload and extend timeouts so the whole file gets processed. I suggest that you set your timeouts well beyond any reasonable amount I think I had mine set to 6 hours.. well this is a development server on my personal computer so nothing else would be running.

After half a day or so of importing the files I was left with all of my posts sorted in categories of State then child City locations. This would allow me to display only the top categories in the sidebar and then the child categories in the center of the page after a state was clicked.

Unfortunately this is where WordPress first displayed its inability to process so many or such a large database. When selecting the State it would take maybe a minute to do the lookups for the cities and then pull back a loop of posts.

I decided that uploading the database to my host may be an option because they do offer some better features. Unfortunately at its largest size the MySQL database was timing out during upload. I was using MySql Administrator for Windows and eventually found a way to split up the 175MB file for upload by separating the tables and also dividing the table into smaller files that could be imported and amended.

I don’t think for a second that WordPress import and export could have handled this file.

Once it was uploaded the site did operate to some extent but it was still impossible to use.

This was not the first large website I made using this method but it seems that I had hit a barrier. Yes possibly the site could be used on a very fast dedicated server but under load may be another thing.

Even with caching enabled the site would still require 200k static html files to be stored. Although this would improve the response for the visitor this is really not an option for hosts that count your resources as not only the amount of data you store but the number of files you store. Again a dedicated and fast server would be required even if the whole site was being fed as html.

So, with all of this in mind I decided that this project was well beyond the limits of WordPress under normal hosting availability.

Having previously built WordPress sites with 20,000 to 50,000 posts and having managed sites with over 100,000 posts and over 700k created in total I was feeling the only way to properly display this information was going to be building my own scripts to call the database directly.

This reduced my MySQL database from 175MB to just about 50MB. The information would be used as directory information and a separate WordPress Blog would be integrated in the main site.

So, this is one instance where a site can be too big for WordPress and your sever to manage.

I eventually decided on only 2 main tables to display the church directory an index and a content table then a third for optional information.

I hope this helps you understand how to examine your limits and find workarounds.

The most important thing when working with large databases is to create backups as you progress.

Then if you hit limitations you may need to split the database into separate tables and the tables into sections with a key that has an auto increment field.

If the data you are using does not have a key then I suggest that you define a new database then use a import after adding a key field.. set it to auto increment and it should line number your rows as they are imported.

This is very helpful for when imports timeout or just crash… then you can verify your data and the last good row imported and begin there.

Again backup .. backup regularly and at any moment that it crosses your mind. If you are working and think for a second I should probably back this up… then you should definitely back it up.

As a final note I have to believe that for a quick site with a lot of information you don’t want your wordpress database to hold over 75,000 posts with a maximum size not over 100MB and this would be maximum. If you are on a great but standard account you may want to reduce that by half and limit your database size to about 50MB.

The workaround for this may be using Multisite or Network Options and forcing wordpress to add tables but probably a better method is to do a new full install on a subdomain and push some of your content into a new database.