Posts tagged ‘Bing’


Rising popularity on Autocade

07.08.2022

Ever since we had to reset the counter for Autocade in March, because of a new server and a new version of Mediawiki, it’s been interesting to see which pages are most popular.

The old ranking took into account everything from March 2008 to March 2022. With everything set to zero again, I can now see what’s been most popular in the last few months.

Some of the top 20 were among the top pages before March 2022, but what’s surprising is what’s shot up into the top slots.

Over the course of half a day on Friday GMT, the Toyota Corolla (E210) page found itself as the top page, home page excepting. And the Kia Morning (TA) page shot up out of nowhere recently, too.

I know our page on the Corolla is number one on Mojeek for a search of that model but that can’t be the only reason it’s done so well. I haven’t studied the referrer data. A shame that link: no longer works on search engines.
 

 

Corolla fans, thank you for your extra 6,000 page views! It’s helped our overall total, but the viewing rate is still down at 2019 levels thanks to the collapse of the Bing index, and the search engines that it’s taken down with it.

I almost feel I’ve shot myself in the foot for promoting Duck Duck Go so much since 2010! But then I hopefully spared a lot of people from being tracked (as much) by the big G.

Posted in cars, interests, internet, media, New Zealand, publishing, technology | No Comments »


Mojeek shows more in its search results than Google

02.08.2022

This was something I had forgotten when doing the numbers on how many pages each search engine had indexed from our sites: what they claim to be their index size and what they let you access are two different things.

And in Lucire’s case, Google, curiously, mostly does not allow access to our dynamic pages in PHP in its main index, reserving them for Google News. Google News, however, has both PHP and HTML. It’s only when you feed in a specific request for one of our stories that we know is on a PHP-generated page that it comes up in the main index’s results.

Let me explain. Remember this from a blog post in July? These are what the search engines said they had indexed for lucire.com (in a site:lucire.com search). I’ve updated it for August 2 and added one more search engine, Yep, another independent, out of interest.
 
Google: 10,600
Mojeek: 3,593
Duck Duck Go: 50
Brave: 19
Bing: 10
Yep: 10
 

But can you see 10,600? Here’s the reality of what is truly visible at the moment when you browse the results’ pages of each search engine as of today:
 
Google: 304
Mojeek: 1,000
Duck Duck Go: 50
Brave: 19
Bing: 10
Yep: 10
 


Above: Google (top) shows fewer pages than Mojeek in a site: search.
 

Mojeek maxes out at 1,000 by design, but like Google, it will find a specific article outside of the 1,000 shown if searched for. Google conks out at 304 (303 when I first did this test).

The bigger Google index is its advantage, but Mojeek does a fine job by sharing more in its results’ pages than Google does—over three times as many. Another win for the plucky independent out of the UK.
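
For anyone who wants to check the sums, here is a minimal Python sketch of the claimed-versus-visible comparison, using only the lucire.com figures from the two lists above. The variable and function names are made up for illustration.

```python
# site:lucire.com counts from the two lists above (2 August 2022).
claimed = {"Google": 10_600, "Mojeek": 3_593, "Duck Duck Go": 50,
           "Brave": 19, "Bing": 10, "Yep": 10}
visible = {"Google": 304, "Mojeek": 1_000, "Duck Duck Go": 50,
           "Brave": 19, "Bing": 10, "Yep": 10}


def visible_share(engine: str) -> float:
    """Percentage of an engine's claimed results that can actually be browsed."""
    return visible[engine] / claimed[engine] * 100


for engine in claimed:
    print(f"{engine:<13} {visible[engine]:>5} of {claimed[engine]:>6} "
          f"({visible_share(engine):.1f}%)")

# How many times more results Mojeek lets you browse than Google does.
mojeek_vs_google = visible["Mojeek"] / visible["Google"]
print(f"Mojeek shows {mojeek_vs_google:.2f} times as many results as Google")
```

The last line is where the "over three times as many" figure comes from: 1,000 divided by 304.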
 
While we’re on the subject, notice how small the Bing index is getting, returning just 10 pages for lucire.com? It’s really collapsed in a big way. Feeding in the other sites I tested earlier, Bing shows declines all round, apart from Travel & Leisure.

Fancy having only 2,723 results from The New York Times, down from 1,190,000 on the 24th ult. Mojeek has over 1,000 times more than Bing, and Google over 12 times more than that.

Previous numbers in parentheses below.
 
Die Zeit
Google: 2,710,000 (2,600,000)
Mojeek: 4,891 (4,796)
Bing: 3,268 (3,770)
 
Annabelle (Switzerland)
Google: 11,900 (11,700)
Mojeek: 408 (405)
Bing: 26 (105)
 
Holly Jahangiri
Google: 618 (738)
Mojeek: 236 (222)
Bing: 10 (49)
 
The Gloss (Ireland)
Google: 17,600 (19,200)
Mojeek: 2,009 (1,968)
Bing: 20 (71)
 
The New York Times
Google: 36,500,000 (36,200,000)
Mojeek: 2,879,513 (2,823,329)
Bing: 2,723 (1,190,000)
 
Lucire
Google: 10,600 (6,050)
Mojeek: 3,593 (3,572)
Bing: 10 (50)
 
The Rake
Google: 11,100 (11,500)
Mojeek: 1,445 (1,443)
Bing: 16, but claims 4! (49)
 

 
Travel & Leisure
Google: 33,500 (28,100)
Mojeek: 10,081 (9,750)
Bing: 383 (220)
 
Microsoft
Google: 118,000,000 (122,000,000)
Bing: 1,927,118 (14,200,000)
Mojeek: 1,772,165 (1,748,199)
 
Detective Marketing
Google: 961 (998)
Mojeek: 579 (579)
Bing: 16 (51)
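
To see the size of Bing's falls at a glance, the figures above can be turned into percentage changes. This is only a rough sketch in Python: the numbers are copied from the lists, the names are illustrative, and for The Rake I have assumed the 16 results Bing actually showed rather than the 4 it claimed.

```python
# Bing counts from the lists above: (2 August figure, previous figure).
bing_counts = {
    "Die Zeit": (3_268, 3_770),
    "Annabelle": (26, 105),
    "Holly Jahangiri": (10, 49),
    "The Gloss": (20, 71),
    "The New York Times": (2_723, 1_190_000),
    "Lucire": (10, 50),
    "The Rake": (16, 49),  # assumption: the 16 shown, not the 4 claimed
    "Travel & Leisure": (383, 220),
    "Microsoft": (1_927_118, 14_200_000),
    "Detective Marketing": (16, 51),
}


def percent_change(current: int, previous: int) -> float:
    """Change from the previous figure to the current one, as a percentage."""
    return (current - previous) / previous * 100


def bing_changes(counts: dict) -> dict:
    """Map each site to its percentage change in Bing's count."""
    return {site: percent_change(cur, prev) for site, (cur, prev) in counts.items()}


# Print the biggest falls first.
for site, change in sorted(bing_changes(bing_counts).items(), key=lambda kv: kv[1]):
    print(f"{site:<20} {change:+7.1f}%")
```

Travel & Leisure is the only site that comes out positive, and The New York Times falls by more than 99 per cent.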

Posted in internet, technology, UK, USA | No Comments »


We’ve reached 4,600 models on Autocade

02.08.2022


 
We’ve hit 4,600 models on Autocade, with the Toyota Will VS taking us to this point, but the stats show we are sitting on 1,180,548 views. We have to get to 1,352,989 on the new count before I can announce we’ve reached 29 million page views.

We’re looking at the lowest traffic on Autocade since 2019, and I’m sure the collapse of the Bing index, taking down the indices of all associated search engines (Duck Duck Go, Qwant, etc.), is to blame. I used to see an increase of 100,000 every week, roughly, but not these days. (PS.: I was still observing this level when we first switched the site over, and the slower growth has probably coincided with when WorldWideWebSize.com recorded Bing’s plummet in late May–early June.)
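
The arithmetic behind the 29 million milestone can be sketched in a few lines of Python. The figures are the ones quoted above. The names are illustrative, and the carried-over total is inferred on the assumption that 29 million is simply the pre-reset total plus the new counter.

```python
current_views = 1_180_548   # views on the new counter
target_views = 1_352_989    # new-counter figure that corresponds to 29 million overall
milestone = 29_000_000
old_weekly_rate = 100_000   # rough weekly increase before the slowdown

# Views still needed on the new counter.
remaining = target_views - current_views

# Assumption: the milestone is the pre-reset total plus the new counter.
carried_over = milestone - target_views


def weeks_needed(views_left: int, weekly_rate: int) -> float:
    """Weeks to cover views_left at a steady weekly_rate."""
    return views_left / weekly_rate


print(f"Views still needed: {remaining:,}")
print(f"Implied pre-reset total: {carried_over:,}")
print(f"Weeks at the old rate: {weeks_needed(remaining, old_weekly_rate):.1f}")
```

At the old rate of roughly 100,000 a week, the gap would have closed in under a fortnight. Swap in a lower weekly figure to see how much longer the current traffic would take.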

Autocade is the one site where we never changed the set-up, other than hosting provider and Mediawiki version. The other sites had various things done to them, with Cloudflare and HTTPS. So given the “invisible” changes—changes we had done before in years gone by—we know “it’s not us, it’s them”.

I’ve listed the three Will models (or WiLL to use the original styling) as Toyotas after I confirmed this with another motorhead, the very knowledgeable Atsuhiro Takeda. They were also always listed as Toyotas by Auto Katalog many years ago, and I believe also by Toutes les voitures du monde. Atsu confirmed that that was how he believed they should be indexed. I’ve had those Will publicity images for a long time and it’s nice they’ve finally gone online in Autocade.

The only oddity in the Autocade stats is the rise in hits for our page on the Kia Morning (TA), coming from nowhere and into sixth place among model pages. Whoever the Morning fans are, I thank you!

Posted in cars, internet, publishing | No Comments »


Bing has tanked

24.07.2022

Well, folks, here’s someone who’s done the maths. The stats in the last post suggested as much but the sample was so small.

Maurice de Kunder at WorldWideWebSize.com has a definitive graph:
 

 

His methodology is explained at his site.

I’d say late May or early June was when I noticed Duck Duck Go queries on Lucire become largely useless. After a month of seeing no improvement, I began looking into alternatives.

No one knows why, since Bing’s not going to admit any of this. If I were Duck Duck Go, I’d be looking into alternatives smartly. Anyone want to get in touch with Alltheweb and Inktomi? Their indices in the early 2000s were bigger than this.
 
PS.: I tried to tell the SEO sub-Reddit, but no joy. It was immediately removed.
 

 
The original text:

Since June I noticed that our internal site:domain.com searches powered by Duck Duck Go were not returning many results any more. As DDG is powered by Bing, I checked it out there, and, sure enough, we dipped from thousands of entries to 50 (and even 10 at one point). This is a 25-year-old site with decent inbound links.

I did a lot of investigating which I wrote up on my own blog (which I won’t link here due to sub-Reddit rules) and came across this website, which seems to suggest Bing has tanked. The person who runs it is pretty clued up on statistics.

I have run a small sample of 10 sites through the search engines as well and these back up their findings.

At this rate, Bing is smaller than Inktomi and Alltheweb in the early 2000s. What strikes me as weird is that all the Bing licensees haven’t done anything, either, so Duck Duck Go, Ecosia, Qwant, and Onesearch have all shrunk, too. (Swisscows is still reasonably sized.)

Anyone else been through something similar in the last two months?

Why don’t they wish to know? I would have thought this was rather serious for an SEO group.

Posted in internet, technology | 6 Comments »


Putting the search engines through their paces

24.07.2022

One more, and I might give the subject a rest. Here I test the search engines for the term Lucire. This paints quite a different picture.

Lucire is an established site, dating from 1997, indexed by all major search engines from the start. The word did not exist online till the site began. It does exist in old Romanian. There is a (not oft-used) Spanish conjugated verb, I believe, spelt the same.

The original site is very well linked online, as you might expect after 25 years. You would normally expect, given its age and the inbound links, to see lucire.com at the top of any index.

There is a Dr Yolande Lucire in Australia whom I know, and whom I’m used to seeing in the search engine results.

The scores are simply for getting sites relevant to us into the top 10, and no judgement is made about their quality or ranking.
 
Google
lucire.com
twitter.com
lucire.net
instagram.com
wikipedia.org
linkedin.com
facebook.com
pinterest.nz
neighbourly.co.nz
—I hate to say it, as someone who dislikes Google, but all of the top 10 results are relevant. Fair play. Then again, with the milliards it has, and with this as its original product, it should do well. 10/10
 
Mojeek
scopalto.com
lucirerouge.com
lucire.net
lucire.com
mujerhoy.com
portalfeminino.com
paperblog.com
dailymotion.com
eldiablovistedezara.net
hispanaglobal.com
Mojeek might be flavour of the month for me, but these results are disappointing. Scopalto retails Lucire in France, so that’s fair enough, but disappointing to see the original lucire.com site in fourth. Fifth, sixth, seventh, ninth and tenth are irrelevant and relate to the Spanish word lucir. You’d have to get to no. 25 to see Lucire again, for Yola’s website. Then it’s more lucir results till no. 52, the personal website of one of our editors. 5/10
 
Swisscows
lucire.net
wikipedia.org
lucire.com
spanishdict.com
lucire.net
lucire.com
drlucire.com
facebook.com
spanishdict.com
viyeshierelucre.com
—Considering it sources from Bing, it makes the same mistakes by placing the rarely linked lucire.net up top, and lucire.com in third. Fourth, ninth and tenth are irrelevant, and the last two relate to different words. Yola’s site is seventh, which is fair enough. 6/10
 
Baidu
lucire.net
lucire.com
lucire.cc
lucire.com
kanguowai.com
hhlink.com
vocapp.com
forvo.com
kuwo.cn
lucirehome.com
—Interesting mixture here. Strange, too, that lucire.net comes up top. We own lucire.cc but it’s now a forwarding domain (it was once our link shortener, up to a decade ago). Seventh and ninth relate to the Romanian word strălucire and eighth to the Romanian word lucire. The tenth domain is an old one, succeeded a couple of years ago by lucirerouge.com. Not very current, then. 7/10
 
Startpage
lucire.com
lucire.com
lucire.net
instagram.com
wikipedia.org
linkedin.com
facebook.com
pinterest.nz
fashionmodeldirectory.com
twitter.com
—All relevant, as expected, since it’s all sourced from Google. 10/10
 
Virtual Mirage
lucire.com
instagram.com
wikipedia.org
lucire.net
facebook.com
linkedin.com
pinterest.nz
lucirerouge.com
nih.gov
twitter.com
—I don’t know much about this search engine, since I only heard about it from Holly Jahangiri earlier today. A very good effort, with only the ninth one being irrelevant to us: it’s a paper co-written by Yola. 9/10
 
Yandex
lucire.com
lucire.net
facebook.com
twitter.com
wikipedia.org
instagram.com
wikipedia.eu
pinterest.nz
en-academic.com
wikiru.wiki
—This is the Russian version. All are relevant, and they are fairly expected, other than the ninth result which I’ve not come across this high before, although it still relates to Lucire. 10/10
 
Bing
lucire.net
wikipedia.org
lucire.com
spanishdict.com
lucire.com
facebook.com
drlucire.com
spanishdict.com
twitter.com
lucirahealth.com
—How Bing has slipped. There are sites here relating to the Spanish word lucirse and to Lucira, who makes PCR tests for COVID-19. One is for Yola. 7/10
 
Qwant.com
lucire.net
wikipedia.org
spanishdict.com
drlucire.com
spanishdict.com
tumblr.com
lucirahealth.com
lacire.co
amazon.com
lucirahealth.com
—For a Bing-licensed site, this is even worse. No surprise to see lucire.com gone here, given how inconsistently Bing has treated it of late. But there are results here for Lucira and a company called La Cire. The Amazon link is also for Lucira. 3/10
 
Qwant.fr
lucire.net
wikipedia.org
reverso.net
luciremen.com
lucire.com
twitter.com
lacire.co
lucirahealth.com
viyeshierelucre.com
lucirahealth.com
—The sites change slightly if you use the search box at qwant.fr. The Reverso page is for the Spanish word luciré. Sixth through tenth are irrelevant and do not even relate to the search term. Eleventh and twelfth are for lucire.com and facebook.com, so there were more relevant pages to come. The ranking of relevant results, then, leaves something to be desired. 5/10
 
Duck Duck Go
lucire.com
lucire.net
wikipedia.org
spanishdict.com
drlucire.com
spanishdict.com
lucirahealth.com
amazon.com
lacire.co
luciremen.com
—Well, at least the Duck puts lucire.com up top, and the home page at that (even if Bing can’t). Only four relevant results, with Lucire Men coming in at tenth. 4/10
 
Brave
lucire.com
instagram.com
twitter.com
wikipedia.org
linkedin.com
lucire.net
facebook.com
fashion.net
wiktionary.org
nsw.gov.au
—For the new entrant, not a bad start. Shame about the smaller index size. All of these relate to us except the last two, one a dictionary and the other referring to Yolande Lucire. 8/10
 

The results are surprising from these first results’ pages.
 
★★★★★★★★★★ Google
★★★★★★★★★★ Yandex
★★★★★★★★★★ Startpage
★★★★★★★★★☆ Virtual Mirage
★★★★★★★★☆☆ Brave
★★★★★★★☆☆☆ Baidu
★★★★★★★☆☆☆ Bing
★★★★★★☆☆☆☆ Swisscows
★★★★★☆☆☆☆☆ Mojeek
★★★★★☆☆☆☆☆ Qwant.fr
★★★★☆☆☆☆☆☆ Duck Duck Go
★★★☆☆☆☆☆☆☆ Qwant.com
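
The star table above is just the scores out of 10 sorted from best to worst. Here is a small Python sketch that rebuilds it, with the scores taken from the write-ups above and the function names made up for the purpose.

```python
# Scores out of 10, in the order the engines were tested above.
scores = {
    "Google": 10, "Mojeek": 5, "Swisscows": 6, "Baidu": 7, "Startpage": 10,
    "Virtual Mirage": 9, "Yandex": 10, "Bing": 7, "Qwant.com": 3,
    "Qwant.fr": 5, "Duck Duck Go": 4, "Brave": 8,
}


def star_bar(score: int, out_of: int = 10) -> str:
    """Filled stars for the score, hollow stars for the remainder."""
    return "★" * score + "☆" * (out_of - score)


def league_table(results: dict) -> list:
    """One line per engine, highest score first."""
    # sorted() is stable, so engines on the same score keep their tested order.
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
    return [f"{star_bar(score)} {engine}" for engine, score in ranked]


for line in league_table(scores):
    print(line)
```

Because ties keep the order in which the engines were tested, tied engines may appear in a slightly different order from the table above.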
 

It doesn’t change my mind about the suitability of Mojeek for internal searches though. It’s still the one with the largest index aside from Google, and it doesn’t track you.

Posted in China, France, internet, publishing, technology, UK, USA | 2 Comments »


Forget Duck Duck Go, Bing, and Google—I’m trying Mojeek

17.07.2022

It was disappointing to note that after switching to HTTPS, and signing on to Bing Webmaster Tools, the search engine results for those sites of ours that made the change are still severely compromised.

I’ve written about searches for my own name earlier, where my personal and company sites lost their first and second positions on all search engines that I knew of after we made the switch. Only Google has my personal site back up top, with the company site in the middle of the second page. Bing has my personal site at number two, and I’d love to tell you where the company site is, but their search engine results’ pages won’t let me advance beyond page 2 (clicking ‘next page’ lands you back on the same page; clicking ‘3’ and above still keeps you on p. 2). Duck Duck Go, which uses Bing results, has it well below that—I gave up looking. And this is after I signed up to Bing Webmaster Tools in the hope I could get the sites properly catalogued.

It’s a real shame because Duck Duck Go has been my default for 12 years this August.

However, it was the loss of search results for Lucire that really bothered me. Here’s a site that’s 25 years old, with plenty of inward links, and c. 5,000 pages. Before the switch to HTTPS, the popular search engines had thousands of pages from our site. These days, Bing and Duck Duck Go tell me they have dozens of pages from Lucire’s website. Again, only Google seems to have spidered everything.

When I check Bing Webmaster Tools, the spidering has been shockingly poor.

The received wisdom that you should have HTTPS instead of HTTP to do better in search engines is BS, and the belief that search engines will eventually catch up has also not been realized. We made the switch in March, and I’m to believe that Bing hasn’t completed the indexing of our sites.

Are they using the same computers New Zealand banks do? (Cheques used to clear overnight in the 1970s, and now banks tell us that even electronic payments can take days. When we last used cheques, they were telling us they would take five to seven days. Ergo, bank computers are slower today than in 1976.)

The real downer is that Lucire’s website search box is powered by Duck Duck Go, so our own site visitors can’t find the things they want to look for. If you believe some of the search engine marketing, over 40 per cent of site visitors use your search function.

What to do?

I began looking at having an internal search again. We used to have a WhatUSeek (later SiteLevel) internal site search, but that site’s search functions appear to be dead (the site is still live). A user on Mastodon recommended Sphinx Search, an open-source internal site search, but the instructions were too complex. I even saw real computer geeks having trouble. The only one that I could understand was called Sphider—I could follow the instructions and knew enough about PHP and MySQL—but it was last updated many years ago, and its successor projects also looked a bit complex.

A Google internal search was absolutely out of the question, as I have no desire to expose our readers to tracking—which is why so many other Big Tech gadgets have been removed from our site(s). Baidu and Yandex also have very limited indices for our sites.

I am very fortunate to have tried Mojeek again, a British search engine recommended to me by Matias on July 2. What I didn’t know then was Mojeek has its own spider and its own index, so it doesn’t have to license anything from Bing. And, happily, it claims to have 3,535 results from lucire.com, which might not be as good as Google’s 5,830, but it beats Bing’s 50 earlier today; in fact, at the time of writing, Bing showed a grand total of 10. That’s how bad it’s got. Duck Duck Go now has 48, also down from a few thousand before March.

Like Google, Mojeek seems to have coped with the switch to HTTPS without falling to pieces! And guess what? For a search of my own name, my personal site is number one, and our work site is number two. Presumably, Mojeek is the only search engine which coped and behaved exactly as the experts said!

You can imagine my next move. Mojeek has a site search, so now all Lucire searches are done through it. And readers can actually find stuff again instead of coming up nearly empty (or having very irrelevant results) as they have done for months.

Duck Duck Go’s lustre had been wearing off as there were recent allegations that its browser allowed Microsoft to track its users, something which Duck Duck Go boss Gabriel Weinberg personally denied on Reddit, saying that users were still anonymous when loading their search results.

I still have good memories of chatting to Gabriel in the early days and figuring out ways of spreading the word on Duck Duck Go. My contribution was going to hotels and changing the search defaults on business centre computers. Back then I had the impression Duck Duck Go did some of its own spidering, but these days, if Bing has a shitty index for your site, the Duck will follow suit. And with HTTPS not living up to its promise, that’s simply not good enough.

Tonight, Mojeek is very much the site of the day here, and I heartily recommend you try it out. I’ve switched the desktop to Mojeek as a default, and I’ll see how it all progresses. Right now I feel it deserves our support more than Duck Duck Go. Finally, we might truly have an alternative to Google, and it’s run from the UK’s greenest data centre. With our servers now being greener, too, running out of Finland, the technology is starting to match up to our beliefs.
 

Google, the biggest index of them all
 

Mojeek, a creditable second place
 

This is it on Bing: a 25-year-old history on the web, and it says it has 10 pages from lucire.com. Altavista, Excite and Hotbot had more in the 1990s
 

Duck Duck Go is slightly better, with 48 results—down from the thousands it once had
 
After switching to HTTPS
Number of results for lucire.com
Google: 5,830
Mojeek: 3,535 (containing the word Lucire, as term-less searches are not allowed)
Duck Duck Go: 48
Bing: 10
 
Number of results for jackyan.com
Google: 878
Mojeek: 437 (containing the term “Jack Yan”)
Duck Duck Go: 54
Bing: 24
 
Number of results for jyanet.com
Google: 635
Mojeek: 297 (containing the word jyanet)
Duck Duck Go: 46
Bing: 10
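
As a rough way of comparing the three domains, each engine's count can be expressed as a share of Google's. The Python sketch below uses the numbers listed above, with illustrative names. Bear in mind that the Mojeek figures come from term searches rather than site: searches, so the comparison is only approximate.

```python
# Result counts after the switch to HTTPS, from the lists above.
results = {
    "lucire.com": {"Google": 5_830, "Mojeek": 3_535, "Duck Duck Go": 48, "Bing": 10},
    "jackyan.com": {"Google": 878, "Mojeek": 437, "Duck Duck Go": 54, "Bing": 24},
    "jyanet.com": {"Google": 635, "Mojeek": 297, "Duck Duck Go": 46, "Bing": 10},
}


def share_of_google(domain: str, engine: str) -> float:
    """An engine's count for a domain as a percentage of Google's count."""
    counts = results[domain]
    return counts[engine] / counts["Google"] * 100


for domain, counts in results.items():
    print(domain)
    for engine in counts:
        print(f"  {engine:<13} {counts[engine]:>5} "
              f"({share_of_google(domain, engine):.1f}% of Google)")
```

For lucire.com, Mojeek comes out at about 60 per cent of Google's figure, while Bing is well under 1 per cent.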
 

Presumably the only search engine that could handle a server going from HTTP to HTTPS and preserving the domains’ positions

Posted in business, internet, publishing, technology, UK | 1 Comment »


Bing Webmaster Tools: how to make sure you vanish from a search engine completely

03.06.2022

With my personal site and company site—both once numbers one and two for a search for my name—having disappeared from Bing and others since we switched to HTTPS, I decided I would relent and sign up to Bing Webmaster Tools. Surely, like Google Webmaster Tools, this would make sure that a site was spidered and we’d see some stats?

Once again, the opposite to conventional internet wisdom occurred. Both sites disappeared from Bing altogether.

I even went and shortened the titles in the meta tags, so that this site is now a boring (and a bit tossy) ‘Jack Yan—official site’, and the business is just ‘Jack Yan & Associates, Creating Harmony’.

Just as well hardly anyone uses Bing then.

Things have improved at Google after two months, with this personal site at number two, after Wikipedia (still disappointing, I must say) and the business at 15th (very disappointing, given that it’s been at that domain since 1995).

Surely my personal and work sites are what people are really looking for when they feed in my name?

The wisdom still seems to be to not adopt HTTPS if you want to retain your positions in the search engines. Do the opposite to what technologists tell you.
 
Meanwhile, Vivaldi seems to have overcome its bug where it shuts down the moment you click inside a form field. Version 5.3 has been quite stable so far, after a day, so I’ve relegated Opera GX to back-up again. I prefer Vivaldi’s screenshot process, and the fact it lets me choose from the correct directory (the last used) when I want to upload a file. Tiny, practical things.

Big thanks to the developers at Opera for a very robust browser, though it should be noted that both have problems accessing links at Paypal (below).

We’ll see how long I last back on Vivaldi, but good on them for listening to the community and getting rid of that serious bug.
 

Posted in internet, marketing, technology | No Comments »


Testing the search engines

30.11.2010

Blekko

I hadn’t heard of Blekko, a search engine, till last week, so armed with a new entrant, I wanted to see how they all compared.
   Blekko’s very pretty, and I’ve told Gabriel Weinberg, the man behind Duck Duck Go, just what it is that makes it attractive. Most of it is the modernist design approach it takes. But is it more functional?
   I have a couple of tests. You may have heard me dis Google’s supplemental index, where pages it deems to be less important wind up. But who makes that determination? And what if there is a page in there that is actually relevant but Google fails to dig it up?
   Google says the supplemental index doesn’t exist any more, but the fact remains that it fails to dig up some pages, especially older ones. So much for its comprehensive index.
   The first test, therefore, is one I have subjected every search engine I encounter to: will it find a 2000 article on Lucire about Elle Macpherson Intimates’ 10th anniversary? It is probably the only article on the subject, and because of this test, I’ve even linked it this year so it can be spidered by the search engines. Last month, Google could not find it, though in 2000–1, it was very easily found.
   If the search engines are as intelligent as their makers claim, they should be able to figure out these concepts and deliver the pages accordingly. The page itself is very basic with no trick HTML—just plain old meta data, as you would imagine for a ten-year-old file.
   Will the search engines find it now, with a few more inward links?

Duck Duck Go: 1st
Blekko: not found, though it locates a reference made on this blog and two others in Lucire, one going back to 2001, at positions 1, 2 and 12
Google: 73rd, with blog entries from here referring to it at 5 and 42, and another link in Lucire at 6
Bing: 1st with old frameset at 2nd
Ask: 7th

   Here’s the second test. In Wired, Google bragged about how its index could find a page about a certain lawyer in Michigan (mike siwek lawyer mi). Unfortunately for Mr Siwek, most of the top entries quickly became those about the Wired article and he was lost again in the index.
   Mr Don Wearing, a friend of mine, is a partner in a shoe retail chain. If I type “Don Wearing” shoes, which of the search engines will deliver me an entry referring to Don Wearing specifically and not some guy called Don who happens to be wearing shoes? (Not long ago, the best the search engines could do was around 12th.)

Duck Duck Go: 2nd
Blekko: says ‘No results found for: “Don Wearing” shoes’ but actually finds the article at 5th
Google: 3rd
Bing: 2nd
Ask: 5th

Not bad: an improvement all round.
   OK, how about speed of addition? Let’s see if the search engines will find the last entry in this blog, added a few hours ago. I’ll use the search term “Jack Yan” TPPA.

Duck Duck Go: not found
Blekko: not found
Google: found the main blog page
Bing: found a link to it at MyBlogLog
Ask: not found, but came up with seven irrelevant results

   This is just a quick test based on three examples that might not reflect everyday use. However, the first two frustrated me earlier when I went to hunt for them on Google (and before I had heard of Duck Duck Go), which is why I remembered them, so admittedly Google was at a slight disadvantage in this test as a result. I never went to Bing or Ask regularly.
   Therefore, I’m not going to draw any conclusions about who is best, but I will say that Google is quicker at finding new material. I would, however, encourage others to give these other search engines a go and see how effective they are. I’m very happy with Duck Duck Go, especially as it does not second-guess my queries with Google’s annoying ‘Showing results for [what Google thinks I typed]. Search instead for [what I actually typed]’. No, Google, I did not type my query wrong—so give me the results already!
   I prefer Duck Duck Go’s approach, which is to treat the web more as a research medium. There is no hiding pages: it just delivers the most relevant result to what I typed, which is why I originally moved to Google at the end of the 1990s.
   Judging by the above, I’m not convinced Blekko is ready for prime-time (which is why it still has a beta tag).
   Of the five tested, it looks like it’s still the Duck for me, complemented with Google News. I’m way more impressed with Duck Duck Go’s privacy policy: no search leakage, no search history, and no collecting of personal information to hand over to law enforcement or, for that matter, the Chinese Politburo.
   And in a year where people have shown that they care about privacy, Duck Duck Go seems to make more sense.

Posted in business, design, internet, technology, USA | 2 Comments »


It’s hard finding the old stuff on Google

26.02.2010

My Wired for March 2010 arrived today (things take a while to reach the antipodes), with the most interesting article being on the Google algorithm. And hold on, this isn’t a Google-bashing blog entry.
   Steven Levy’s article was probably written before the furore over the Google Buzz privacy flap. And it points out how Google has learned from users for search, producing more relevant results than its competitors. With 65 per cent of the search market (and close to 100 per cent of my searches for many years), it has a bigger pool to learn from, too.
   Recently I have noticed in ego-searches that Google is now smart enough to distinguish between searches for yours truly and those for Jack Yan & Associates (both in quotes), so that the former results in a mere 53,800 references, and the latter in 124,000 (quite a bit down from yesterday, when I first hatched the idea about blogging this topic). That is smart in itself: knowing when people are looking for me (or my blog) and when they seek the company. By comparison, Yahoo! lists 280,000 for the former and 42,500 for the latter, as the latter is (if you look at terms alone) a more specific search.
   Once upon a time—even as late as 2009—a search for my name would result in both my personal and work sites.
   I’m pretty proud of my company and the people who work with me, and in election year, if someone were checking out my background, I sure would not mind them getting to JY&A as well. On the other hand, thanks to this distinction, my mayoral campaign site comes up in the top 10 in a search for my name. Either way, it’s relevant to a searcher—so all is well.
   But is this really how people search? If I were searching for, say, Heidi Klum, I would probably want (I write this before I even attempt a search) her bio, a bit of news, pictures to ogle, and Heidi Klum GmbH, her company. This is exactly what Google delivers, with her Wikipedia entry in addition (as the first result). (Bing does this, too; Yahoo! puts Heidi Klum GmbH at number one.) Maybe someone could get back to me on their expectations for a name search although, as I said, Google is doing me a huge political favour by distinguishing me from my business. The ability to distinguish the two is, by all accounts, clever.
   Levy cites an example in his article about mike siwek lawyer mi which, when fed into Google at the time of his writing, gets a page about a Michigan lawyer called Mike Siwek. On Bing, ‘the first result is a page about the NFL draft that includes safety Lawyer Milloy. Several pages into the results, there’s no direct referral to Siwek.’ (A Bing search today still does not have Mr Siwek appear early on; in fact, most now discuss Levy’s article; sadly for Mr Siwek, the same now applies on Google, with the first actual reference to his name being the 18th result. Cuil, incidentally, returns nothing—so much for supposedly having a Google-busting index size.)
   But I have one that is puzzling to me. Ten years ago, Lucire published an article about the 10th anniversary of the Elle Macpherson Intimates range. One would think that the query “Elle Macpherson Intimates” “10th anniversary” would bring this up first—in fact, I did have to search for the URL last year when writing a blog post. On Google, this is, in fact, the last entry. On Bing, it is the first. On Yahoo!, it is second.
   Of course, Google may well have judged that the Lucire article is too old and that the overwhelming majority of searches are for current or recent information. And with the article being 10 years old, I hardly imagine there to be too many links to it any more. However, I thought that, since we can now very easily sort our searches by date (especially with the new layout of the results’ page), it might just give us the most precise result. The lead page to the article is in frames (yes, it’s that old), which may have been penalized by Google. But many of the leading results that turn up that have these two terms do not have them with great proximity (in fact, numbers one and two do not even have the term Elle Macpherson Intimates any more). However, I don’t think the page I hunted for should be last, especially as none of the preceding entries even have the words in their title.
   I am not complaining about the Google situation since a 2009 Lucire article that links to the old Elle Macpherson one comes up in the top 10, so it’s still reasonably easy to get to via the top search engine. (Cuil lists the 2009 article from Lucire in its top 10, too.) There’s also a blog entry from me that links it, and that appears on the second page.
   It’s just that I hold a belief that many people who search using Google (or any search engine) do so for research. They want to know about Brand X and, sometimes, about its history. If I type a person’s name, there is a fairly good chance I want to know the latest. But when I qualify that name with something that puts it in the past (anniversary), then I’d say I want something historical. That includes old pages.
   While few rely on a fashion magazine for historical research (though, believe me, we get queries from scholars who want citations of things they saw in Lucire), Google results nos. 1 through 53 and the majority of Cuil’s results (which are very irrelevant—the first two are of a domain that no longer exists and a blank page) don’t hit the spot.
   For the overwhelming majority of searches—well over 90 per cent—Google serves me just fine, which is why you don’t see me complain much about the quality of its results. Even here, it’s not so much a complaint, but professional curiosity. It would be sad for Bing or Yahoo! to be labelled as search engines for historical searches, but someone should fairly provide access to the older, yet still relevant, pages on the internet for everyday queries (so I don’t mean the Internet Archive).

PS.: There’s one more search engine that should be considered. Gigablast, which I have used on and off over the years, does not list the 2000 article, either. Like Google, the 2009 one is listed, and only five results are returned.—JY

Posted in internet, politics | 1 Comment »