While rewriting the (fluff)Grabber, I had to invent a lot of new things for the new crawling system, which is basically an expanded verifier. Mysql has always been very ambiguous to me because there is very little documentation with useful examples. Someone should setup a php.net like documentation system for mysql, or at least have some examples about what you can do with it from PHP. Anyway, I’m going to share some really useful mysql queries I found recently.
- SELECT COUNT(*) FROM a_table WHERE something = 1; – This is a little talked about query when you start learning mysql since most books and tutorials want to keep you in php. This is basically mysql_num_rows but much faster since it’s asking mysql rather. In addition to that speed, it only keeps a single numeric row in memory instead of counting hundreds of rows. The speed difference isn’t noticeable for many things, but for pagination, it is sometimes.
- (SELECT value FROM `metadata` WHERE metadata.key = “crawl_rate”) – Okay, ignore the table name and the where’s equal value. What I’m showing here is part of another query. In mysql you can fetch queries in other queries. I had an options table with key/value pairs and used this to get the options directly in mysql than in php. This means less PHP code and more speed. Go sub-querieres!
- UPDATE a_table SET a_counter = a_counter + 1 WHERE something = 1; – This one might look useless at first, but often you want to keep track of many times an event happens. You use a counter for that and you need to store it somewhere, not a text file but in the database. Great. You realize that it takes a query to fetch the value of a_counter and then an update query to set it to the new incremented value? This query takes care of all of that in one go. It’s wonderful.
- SELECT * FROM `fluff_books` WHERE DATE_SUB(CURDATE(),INTERVAL (SELECT value FROM `metadata` WHERE metadata.key = “crawl_rate”) DAY) >= last_verified AND deleted = 0 ORDER BY clicks ASC LIMIT 0,1 – This is my favorite. It relies on a sub-query and it also uses some incredible date logic. This query reads out in English, “Select all field from fluff_books where the current time minus 30 days is greater than or equal to last verified and not deleted which are ordered by clicks.” Date logic is a mess to me and probably a lot of others. Basically, you’re subtracting X days (since the number is in my options table) from the current date.
Those are some of the new things I’ve learned while working on (fluff)Grabber.