2007.12.12 11:42 PM
Google Does Numeric HTML Entities
Ran into an interesting Google feature today while researching some obscure entities found in scraps of HTML that I had to scrub, store, and index. Maybe everyone knows this, but it was new to me.
If a Google search contains a numeric HTML entity, in the form &#xxx;, Google will convert it to its proper value. So, for instance, if you submit a search for "Air &#225; Danser", it will return "Air á Danser" and perform the expected search. It will not do the same thing for the equivalent named entity reference "Air &aacute; Danser".
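For anyone who wants to reproduce that behavior locally, here is a minimal Python sketch of the same idea: decode numeric entities (decimal or hex) and leave named ones alone. This is only an illustration of the observed behavior, not how Google actually does it; the function name `decode_numeric_entities` is made up for this example.

```python
import re

# Matches decimal (&#225;) and hexadecimal (&#xE1;) numeric character references.
_NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_numeric_entities(text: str) -> str:
    """Replace numeric entities with their characters; leave named entities untouched."""

    def _replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Not a valid Unicode code point: keep the original text as-is.
            return match.group(0)

    return _NUMERIC_ENTITY.sub(_replace, text)


print(decode_numeric_entities("Air &#225; Danser"))     # numeric entity is decoded
print(decode_numeric_entities("Air &aacute; Danser"))   # named entity is left alone
```

Python's standard `html.unescape` would decode both forms; the regex here is deliberately limited to numeric references to mirror what the search box appears to do.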
So when faced with an unfamiliar numeric entity, like &#8734; or &#8501;, finding out what it looks like is as easy as a Google search.
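The same lookup can be done offline with Python's standard library. The sketch below is just one way to do it, and the helper name `describe_entity` is invented for this example: `html.unescape` turns the entity into its character, and `unicodedata.name` reports the official Unicode name.

```python
import html
import unicodedata


def describe_entity(entity: str) -> tuple[str, str]:
    """Return (character, Unicode name) for a single HTML entity such as '&#8734;'."""
    char = html.unescape(entity)
    if len(char) != 1:
        # Either the entity was not recognized or it expands to several characters.
        raise ValueError(f"not a single-character entity: {entity!r}")
    # Fall back to a placeholder for code points that have no Unicode name.
    return char, unicodedata.name(char, "<unnamed>")


for entity in ("&#8734;", "&#8501;", "&#225;"):
    char, name = describe_entity(entity)
    print(entity, char, name)
```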
Lest one think this is a mere byproduct of a web page taking a value in a POST and returning it as the value of a text input element, consider that neither of the other two major search engines provides this feature.
Google is clearly evaluating the numeric entity, converting it to its proper character, and subsequently using the character in its search. Nice touch.