Re: full-text search question

On Wed, Jun 18, 2008 at 02:49:48PM +0200, Sabbiolina wrote:
> www.google.com is only treated as a unique word? Why not producing multiple
> tokens like www.google.com, www, ., google, ., com? (obviously www and . can
> be nulled or stopworded).

You wouldn't want to get the token ".".  It's not a token, but a label
boundary.  So in your analogy of treating the labels in a FQDN as
"words", the "." needs to be treated the way spaces are between words.
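A minimal sketch of that idea in Python, for illustration only: the dot is
used purely as a separator between labels and is never emitted as a token.
The function name fqdn_tokens and the stopword set are hypothetical, and
this is not how any particular full-text search parser is implemented.

```python
def fqdn_tokens(hostname, stopwords=frozenset({"www"})):
    """Return the whole host name plus its labels, never the "." itself.

    The stopword set is an assumption made for this example.
    """
    # Normalize: trim whitespace, drop a trailing root dot, lowercase.
    name = hostname.strip().rstrip(".").lower()
    if not name:
        return []
    # "." is a label boundary, so split on it the way words split on spaces.
    labels = [label for label in name.split(".") if label]
    tokens = [name]
    # Add the individual labels only when there is more than one,
    # otherwise the single label would just duplicate the full name.
    if len(labels) > 1:
        tokens.extend(label for label in labels if label not in stopwords)
    return tokens


print(fqdn_tokens("www.google.com"))
```

Under these assumptions the full name stays searchable as one token, the
labels "google" and "com" become tokens of their own, and "www" is dropped
as a stopword.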

A

-- 
Andrew Sullivan
ajs@xxxxxxxxxxxxxxxxx
+1 503 667 4564 x104
http://www.commandprompt.com/

