[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: MySQL v4 FTS speed



Lets say I have a table with 4 fields

itemKey INT(10) unsigned auto_increment primary key
status enum("forsale", "sold")
description TEXT
price DECIMAL(10,2)

And I do a

SELECT * FROM table
WHERE MATCH (description) AGAINST ("A really nice toy")
AND status="forsale"
ORDER BY MATCH (description) AGAINST ("A really nice toy")
LIMIT 10

And lets further state that there at 10,000,000 records in the table and 7,500,000 of them are for sale. Lets also say that the minimum word length is set to 3 (so toy is considered significant for the fulltext search) and there is a fulltext index on description.

Lets also assume that there are 1,000,000 records with the word "really" , 1,000,000 records with the word "nice" and 500,000 records with the word "toy". Lets say there are only 500 records with all three words.

Approximately how many records will this query need to look at to return a result? Approximately how long will it take?


First - fulltext engine does NOT looks at records at all for text search,
it builds a list of documents purely from the index. Of course, MySQL
can later retrieve actual rows, based on this list.

So, I'll answer the question "...how many index entries will this query..."

The query, as you wrote it, will look at all the 2,500,000 index entries.


Ok. That is what I thought.




How would the following query compare:

SELECT * FROM table
WHERE MATCH (description) AGAINST ("+A +really +nice +toy")
AND status="forsale"
ORDER BY MATCH (description) AGAINST ("+A +really +nice +toy")
LIMIT 10


In 4.0.1 this query will be absolutely identical to the first one.
As boolean fulltext search was not documented we took a liberty of
changing the syntax slightly. Now it IS documented.


So +, - etc don't do anything unless you switch to boolean mode.



The query

  SELECT * FROM table
  WHERE MATCH (description) AGAINST ("+A +really +nice +toy" IN BOOLEAN MODE)
  ORDER BY MATCH (description) AGAINST ("+A +really +nice +toy" IN BOOLEAN MODE)
  LIMIT 10

will abort the search when it retrieves the 500th document with all the
three words. It's impossible to say, how many index entries it will look
at.


Where does the number 500 come from? Did you mean 10?



The query

  SELECT * FROM table
  WHERE MATCH (description) AGAINST ("+A +really +nice +toy" IN BOOLEAN MODE)
  LIMIT 10

will abort the search when it retrieves the 10th document with all the
three words. It's impossible to say, how many index entries it will look
at.

Regards,
Sergei





---------------------------------------------------------------------
Before posting, please check:
  http://www.mysql.com/manual.php   (the manual)
  http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <mysql-thread92400@xxxxxxxxxxxxxxx>
To unsubscribe, e-mail <mysql-unsubscribe-treed=ultraviolet.org@xxxxxxxxxxxxxxx>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php



This mailing list archive is a service of Copilot Consulting.