I must say, this came as a shock to me, until I looked at the results and thought about it for a minute. Before jumping in, let me explain how and where I encountered this. I was recently working on a project called Deja Scene – The actor to actor movie database. The purpose of this site is to find two actors who have starred in multiple movies together.
Because the nature of the algorithm an actor can be either on the left side of the connection or the right side.
Let’s look at an example, if I search for George Clooney, I will find that he has 372 Deja Scene connections. Here was the original query I used to retrieve the list of his connections:
[code]
SELECT *
FROM `actor2_actors`
WHERE actor1_id =1
OR actor2_id =1
[/code]
Seems pretty straight forward; however, I was seeing really slow performance and I couldn’t figure out why. I created an index on both the actor1_id and the actor2_id, so I assumed Mysql was properly indexing this. Boy was I wrong. Look at the results when I use the EXPLAIN function:
As you can see, Mysql thinks it can be my key that I created, but it doesn’t use it! No wonder it’s not performing well.
I scratched my head for a bit and thought a UNION query would accomplish the same thing and would be able to better recognize the index.
I proceeded to delete the original index that had both ids together and created two indexes, one for each id.
My query now became:
[code]
EXPLAIN SELECT *
FROM `actor2_actors`
WHERE actor1_id = 1
UNION SELECT *
FROM `actor2_actors`
WHERE actor2_id = 1
[/code]
Now check out the EXPLAIN results, each side of the union is using my indexes, exactly what I want to happen.
After implementing this fix, performance was fixed and my Google graph from webmaster tools is showing me this as well!
Because I was using a framework, the UNION added a bit of development effort because I was going outside the framework, but never let this deter you from seeking the best way to do something!