Posts tagged ‘search engine’


Testing the search engines: Bing likes antiquity; most favour HTML over PHP

21.09.2022

Bing is spidering new pages, as long as they’re very, very old.

Last week, we added a handful of Lucire pages from 1998 and 1999. An explanation is given here. And I’ve spotted at least two of those among Bing’s results when I do a site:lucire.com search.

As a couple of newer pages have also shown up, I doubt there’s any issue with the template; and the home page now also appears, too. But, by and large, Bing is Microsoft’s own Wayback Machine, and most of the Lucire results are from the 1990s and early 2000s.

It got me thinking: do the other search engines do this, too? For years, Google grandfathered older pages and they came up earlier. (Meanwhile, searches for my own name still have this site, and the company site, down, having lost first and second when we switched from HTTP to HTTPS in March. Contrary to expert opinion, you don’t recover, at least not quickly.)

As Lucire includes the date of the article in the URL, this should be an easy investigation. We’ll only do the first 50 results as that’s all Bing’s capable of. I’ll try not to include any repeat results out of fairness. ‘Contents’ pages’ include the home page, the Lucire TV and Lucire print shopping pages, and tag and category pages.
 
Bing
Contents’ pages ★★★
1997
1998
1999 ★★★★
2000 ★
2001 ★★★★★★★★
2002 ★★
2003 ★★★
2004 ★★★★
2005 ★★
2006
2007 ★★★
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018 ★
2019 ★
2020
2021
2022
 
Google
Contents’ pages ★★★★★★★★★★★★★
1997
1998
1999
2000
2001
2002 ★★
2003
2004 ★★
2005
2006
2007 ★
2008
2009
2010 ★
2011 ★★★
2012 ★
2013 ★★
2014 ★★★
2015 ★
2016 ★★
2017 ★
2018 ★★★
2019 ★★★
2020 ★★★★★★★
2021 ★
2022 ★★★★
 
Mojeek
Contents’ pages ★★★★★★
1997
1998
1999
2000
2001
2002
2003
2004 ★
2005
2006
2007
2008
2009 ★
2010 ★★
2011 ★★
2012 ★★★
2013 ★★★★
2014 ★★★
2015 ★★★★★
2016 ★★★★★★★
2017 ★★★★★★
2018 ★★★
2019 ★★★★
2020 ★★★
2021
2022
 
Baidu
Contents’ pages ★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018 ★
2019 ★
2020
2021 ★★★
2022 ★
 
Yandex
Contents’ pages ★★★★★
1997
1998
1999 ★★★★★
2000 ★★★★★★
2001 ★★★
2002 ★★★
2003 ★★★
2004 ★
2005
2006
2007 ★★★★
2008 ★★
2009 ★★
2010 ★★★★
2011 ★★★
2012 ★★
2013 ★
2014 ★★
2015
2016
2017
2018
2019
2020 ★★★
2021 ★
2022
 

To me, that was fascinating. My instincts weren’t wrong with Bing: it’s old and it favours the old (two of the restored articles were indexed). From the first 50 results, 18 results were repeats—that’s 36 per cent. I’m of the mind that Bing is so shot that it can only index old pages that don’t take up much space. New ones have a lot more data to them, generally.
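
For anyone wanting to repeat the tally without counting by hand, here is a minimal sketch in Python. It assumes the year appears as a four-digit path segment in each URL, which may not match Lucire's real URL scheme, and the sample URLs are hypothetical rather than actual search results. Anything without a year is treated as a contents' page.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumption: the article year is a four-digit path segment (1997 to 2022).
# The site's real URL scheme may differ.
YEAR_SEGMENT = re.compile(r"^(199[7-9]|20[01]\d|202[0-2])$")


def bucket(url):
    """Return the year found in the URL path, or 'contents' if there is none."""
    for segment in urlparse(url).path.split("/"):
        if YEAR_SEGMENT.match(segment):
            return segment
    return "contents"


def tally(urls):
    """Count unique results per bucket and report how many were repeats."""
    unique = list(dict.fromkeys(urls))  # drop repeats, keep first-seen order
    repeats = len(urls) - len(unique)
    return Counter(bucket(u) for u in unique), repeats


# Hypothetical sample, not real search results.
sample = [
    "https://example.com/",
    "https://example.com/2001/0404fe0.shtml",
    "https://example.com/2001/0404fe0.shtml",
    "https://example.com/2019/some-story/",
]
counts, repeats = tally(sample)
for label, n in sorted(counts.items()):
    print(label, "★" * n)
print(f"{repeats} of {len(sample)} were repeats ({repeats / len(sample):.0%})")
```

The same arithmetic gives the figure above: 18 repeats out of 50 results is 36 per cent.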

Google does a good job with the top-level and second-level contents’ pages, though there were a few strange tag indices. But the distribution is what you’d expect: people would search for more recent stories. I know we had some popular stories from 2002 that still get hit a lot.

Mojeek has a similar distribution, though it should be noted that you can’t do a blanket site: search. There must be a keyword, and in this case it’s Lucire. The 2016 pages form the mode, which I don’t have a huge problem with; it’s better than the 2001 pages, which Bing has over everything else.

Baidu’s one is crazy as individual stories are seldom spat out in the first five pages, the search engine preferring tag indices, though half a dozen later story pages do make it into its top 50.

Finally, Yandex leans toward older pages, too, including our most popular 2002 piece. It’s the 2000 stories it has the most of among the top 50, and there’s a strange empty period between 2015 and 2019. But at least there is a fairer distribution than Bing can muster.

The other query that I had was whether these search engines were biasing their results toward HTML pages, rather than PHP ones. If that’s the case, then it could explain Bing’s preference for the old stuff (Lucire didn’t have PHP pages till 2008; prior to that it was all laboriously hand-coded, albeit within templates).
 
Bing
★★★★★★★★★★★★★★★★★★★★★★★★★ HTML
★ PHP
 
Google
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★ HTML
★★★★★★★★★ PHP
 
Mojeek
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★ HTML
★★★★★★★★★★★★★★★★★ PHP
 
Baidu
★★★★★★★★★★ HTML
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★ PHP
 
Yandex
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★ HTML
★★★★★★ PHP
 

I think we can safely say there’s a preference for HTML over PHP. Mojeek brings up a lot of HTML pages after the top 50, even though this sample shows the split isn’t as severe.
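
The split could be counted along these lines. This is only a sketch: it assumes a page's type can be read from its file extension, which is a simplification, since a PHP-driven page need not show .php in its URL. The function names and sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumption: hand-coded static pages end in one of these suffixes.
STATIC_SUFFIXES = (".html", ".htm", ".shtml")


def page_type(url):
    """Classify a URL as HTML, PHP or other by its path suffix."""
    path = urlparse(url).path.lower()
    if path.endswith(STATIC_SUFFIXES):
        return "HTML"
    if path.endswith(".php"):
        return "PHP"
    return "other"


def split(urls):
    """Return per-type counts and HTML's share of the HTML-plus-PHP total."""
    counts = Counter(page_type(u) for u in urls)
    total = counts["HTML"] + counts["PHP"]
    share = counts["HTML"] / total if total else 0.0
    return counts, share


# Hypothetical sample, not real search results.
sample = [
    "https://example.com/2001/story.shtml",
    "https://example.com/2004/feature.html",
    "https://example.com/news/index.php",
]
counts, share = split(sample)
print(counts["HTML"], "HTML,", counts["PHP"], "PHP,", f"{share:.0%} HTML")
```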

Our PHP pages are less significant though: they contain news stories, and these are often ones other media covered, too. But I would have thought some of the more popular stories would have made the cut, and here it’s Mojeek’s distribution that looks superior to the others’. It seems like it’s actually analysing the page content’s text, which is what you want a search engine to do.

Baidu’s PHP-heaviness is down to all the tag indices—rendering it not particularly helpful as a search engine.

On these two tests, Mojeek and Google rank best, and Yandex comes in third. Baidu and Bing are a distant fourth and fifth.

Posted in China, culture, internet, media, publishing, technology, UK, USA | No Comments »


Bing hates novelty—it’s really Microsoft’s Wayback Machine

27.08.2022

Bing is still very clearly near death, as this latest site: search shows.
 

 

It manages a grand total of 10 pages from Lucire, and as outlined before, some are pages that have not been linked to for 17 years.

I purposely updated some of the pages Bing had in its limited capacity, and strangely, those have disappeared! Bing doesn’t want anything new, as it appears to be Microsoft’s Wayback Machine.

The fifth result here is a case in point. Some of you may recall lucire.com/about.shtml appearing in all the search engines, including Bing. This is a page last updated in 2004, with some final tweaks in 2012 (I assume for ad code; I don’t recall). It was a page that I decided I would stick on to a new template, since the search engines loved it so much. I copied the text from our licensing site. And, for the sake of online archæology, I put the 2004 page exactly as it was into a file called about-2004.shtml.

Bing must still be alive enough to spider and index the renamed page, but it rejects the revised about.shtml!

It’s similar to what I wrote in mid-August when I updated other ancient pages from the early 2000s: Bing rejected them, including a frameset that now pointed at the latest page!

You may be thinking: obviously, you are doing something wrong with your newer code, Jack, for Bing to favour the old stuff. But look at the fourth result: it’s from 2020, the one “new” page that Bing has managed to index and show. I don’t think we have anything wrong with our code if this page has made it in.

Google happily included the new about.shtml.

A search for Lucire itself on Bing now does include the home page, which is a new development in a search engine that’s limping along. So much for the earlier claim that there were issues with the page that prevented it from appearing.

Posted in internet, media, publishing, technology, USA | No Comments »


Mystery sitemap files in Bing

12.08.2022

I only signed up to Bing Webmaster Tools when investigating why the company site did so poorly in Bing and Duck Duck Go—we now know it was nothing to do with us, and everything to do with a search engine basically disintegrating before our very eyes.

This, too, was interesting, from a screenshot dated July 20, 2022. I never added these sitemaps, and they all pre-date when I signed up to Webmaster Tools. They were all there when I went to the tools for lucire.com. They are not RSS feeds we’ve ever sanctioned, though of course someone could have created them intentionally to follow a subject. Maybe someone at Microsoft?
 

 

You may notice the number of pages: 51. These 51, however, have no real bearing on the 50-odd that Bing can display before it craps out.

I’ve since added sitemaps for the rest of the site, to no avail, natch.

Anyone else find weird sitemap files in their account after signing up?

Posted in internet, technology | No Comments »


Updating old pages since the experts are wrong

12.08.2022

With all the odd results coming up in site searches—it’s not restricted to Bing—I attended to some of the older pages on our websites.

Curiously, in a site:lucire.com search, even Google has our 2005 competition page up high, namely in fifth. There is only one link from our site internally to this page. I know of none externally. The idea about Backrub and “link juice” doesn’t ring true here as there is no way that page should be ranked so highly.
 


Top: Google has our 2005 competition page ranked very highly despite it being a redirect. Above: Internally, only one file refers to it, dating from the 2000s.
 

Not only that, it’s a page that refreshes to another on the site. So much for the claim that such pages rank lowly and that search engines don’t like them.

Nevertheless, as it’s not relevant or useful any more, I deleted it (though it remains in Google at the time of writing).

The ‘About’ page I’ve discussed before and it remains in fourth, despite not being linked from anywhere recent on our site. It was updated with text from our licensing website and now also follows the rest of the site—though we haven’t bothered making any new links to it. It’s really just for the search engines. (For nostalgia’s sake, it has a link to the 2004 page that the search engines love so much.)

We had so many frameset pages on the Lucire site that I updated a few of those, though—rightly or wrongly—I left the frames intact. Well, if they rank so highly, contrary to what the experts all say, then why not?

The one that had the most surgery, however, was jyanet.com/lucire, Lucire’s original URL in 1997. That still comes up in 23rd for me in Google (for the search Lucire), and 20th in Startpage. This hasn’t been linked to since 1998 by us, and I doubt very many outside of our company would. It was our home only for about six months after launch.

Given its enduring popularity, we’ve given it a Bootstrap template and it shares a stylesheet with the rest of the Lucire site, despite it being at another domain. It now contains links to other Lucire sites, which seems a fitting “gift” to the page as we celebrate our 25th anniversary.
 

Posted in business, internet, New Zealand, publishing, technology, USA | No Comments »


Attempting re-entry into Bing’s Pubhub

08.08.2022

In early July, I wanted to see if we could add Lucire to Bing as a news source in their Pubhub—after all, Google has us as one, as Yahoo, Altavista and Excite had back in the day. And I’d say that 25 years of publishing with an international team might qualify us as being media.

The folks came back rejecting us, saying we needed to come back in a month’s time. Usual story: look at our rules, you must have messed up.

Bing tells everyone this these days, because it’s a good way to keep webmasters confounded as they try to figure out what’s wrong with their site and why they can’t get it listed. It’s the same with Pubhub.

The one “rule” that might be very broadly interpreted in their favour was that articles needed to have bylines. Granted, a lot of news ones don’t, since sometimes we don’t want credit for them, and you don’t always see a reporter’s name for shorter, simpler items. But features do have bylines. And when Bing swung round in early July, coincidentally I had written quite a lot of the last bunch of articles, so my name was all over them. That was a no-no.

So here we are, a month and a few days on. The home page (the one that Bing declines to include in their index now, as it prefers pages from the early 2000s that we haven’t linked to for over 17 years) contains articles from me, Stanley Moss, Lola Cristall, Jody Miller, and Elyse Glickman. There’s one story on Panos Papadopoulos that he wrote in the first person.

What’s the bet that nothing will happen?

Sometimes you have to give it a go, even when you know nothing will happen—just to prove a point.
 

Above: The top pages in a site:lucire.com search on Bing. Five of these pages we haven’t linked to in 17 years. As a search engine, it makes absolutely no sense.
 
I was surprised, however, that Bing claims to have 330 results for site:lucire.com today, up from 10. It’s still a tenth of what Mojeek has, and a twentieth of what Google has. But it is an improvement. Maybe the worst is over?

It’s still useless as a general search though, and even more useless as an internal search. The fact that popular pages are excluded and 17-year-old ones aren’t means something remains very wrong with the search engine.
 
PS. (August 9 NZST): I spoke too soon. Bing says 330 results, but try looking beyond 50, which was what it tended to cap Lucire at.
 

Posted in internet, media, publishing, technology | No Comments »


Bing has tanked

24.07.2022

Well, folks, here’s someone who’s done the maths. The stats in the last post suggested as much but the sample was so small.

Maurice de Kunder at WorldWideWebSize.com has a definitive graph:
 

 

His methodology is explained at his site.

I’d say late May or early June was when I noticed Duck Duck Go queries on Lucire become largely useless. After a month of seeing no improvement, I began looking into alternatives.

No one knows why, since Bing’s not going to admit any of this. If I was Duck Duck Go, I’d be looking into alternatives smartly. Anyone want to get in touch with Alltheweb and Inktomi? Their indices in the early 2000s were bigger than this.
 
PS.: I tried to tell the SEO sub-Reddit, but no joy. It was immediately removed.
 

 
The original text:

Since June I noticed that our internal site:domain.com searches powered by Duck Duck Go were not returning many results any more. As DDG is powered by Bing, I checked it out there, and, sure enough, we dipped from thousands of entries to 50 (and even 10 at one point). This is a 25-year-old site with decent inbound links.

I did a lot of investigating which I wrote up on my own blog (which I won’t link here due to sub-Reddit rules) and came across this website, which seems to suggest Bing has tanked. The person who runs it is pretty clued up on statistics.

I have run a small sample of 10 sites through the search engines as well and these back up their findings.

At this rate, Bing is smaller than Inktomi and Alltheweb in the early 2000s. What strikes me as weird is that all the Bing licensees haven’t done anything, either, so Duck Duck Go, Ecosia, Qwant, and Onesearch have all shrunk, too. (Swisscows is still reasonably sized.)

Anyone else been through something similar in the last two months?

Why don’t they wish to know? I would have thought this was rather serious for an SEO group.

Posted in internet, technology | 6 Comments »


More signs of Bing’s tiny index

24.07.2022

Because I have OCD, one more round of stats.

It’s not just us: Bing seems to have a reduced index for everyone. Here are a handful of sites that I fed in at random for site: searches. The only site where it beats Mojeek in indexed pages is, you guessed it, Microsoft’s. I guess since Google favours Google’s own results, Bing does a better job indexing Microsoft’s—and I doubt it’s because their own people conform to Bing’s applied-when-they-choose rules.
 
Die Zeit
Google: 2,600,000
Mojeek: 4,796 (0·18 per cent of Google’s total)
Bing: 3,770 (0·15 per cent of Google’s total)
 
Annabelle (Switzerland)
Google: 11,700
Mojeek: 405 (3·46%)
Bing: 105 (0·90%)
 
Holly Jahangiri
Google: 738
Mojeek: 222 (30·08%)
Bing: 49 (6·64%)
 
The Gloss (Ireland)
Google: 19,200
Mojeek: 1,968 (10·25%)
Bing: 71 (0·37%)
 
The New York Times
Google: 36,200,000
Mojeek: 2,823,329 (7·80%)
Bing: 1,190,000 (3·29%)
 
Lucire
Google: 6,050
Mojeek: 3,572 (59·04%)
Bing: 50 (0·83%)
 
The Rake
Google: 11,500
Mojeek: 1,443 (12·55%)
Bing: 49 (0·43%)
 
Travel & Leisure
Google: 28,100
Mojeek: 9,750 (34·70%)
Bing: 220 (0·78%)
 
Microsoft
Google: 122,000,000
Bing: 14,200,000 (11·64%)
Mojeek: 1,748,199 (1·43%)
 
Detective Marketing
Google: 998
Mojeek: 579 (58·02%)
Bing: 51 (5·11%)
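
The percentages in these lists are each engine's count as a share of Google's total. A small Python sketch of that calculation, using two of the sites' figures copied from above (the function name is my own):

```python
def share_of(count, baseline):
    """Return count as a percentage of baseline, rounded to two places."""
    return round(100 * count / baseline, 2)


# Figures copied from the lists above.
results = {
    "Lucire": {"Google": 6050, "Mojeek": 3572, "Bing": 50},
    "The Rake": {"Google": 11500, "Mojeek": 1443, "Bing": 49},
}
for site, counts in results.items():
    for engine in ("Mojeek", "Bing"):
        pct = share_of(counts[engine], counts["Google"])
        print(f"{site}: {engine} has {pct:.2f}% of Google's total")
```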
 

In the earlier Microsoft thread I linked, the original poster found that after they joined Bing Webmaster Tools and imported their Google data, that’s when their site vanished from Bing. So, again, we’re not alone.

I’d seriously be rethinking my business model if I was running a search engine that was reliant on Bing.

Posted in internet, media, publishing, technology, USA | No Comments »


Not alone in discovering Bing is broken

24.07.2022


MIA again on Bing: Lucire’s home page. The alt tags are not missing, with perhaps some exceptions for small logos. And not having an H1 tag is not fatal to other pages of ours that have been indexed. It remains bizarre.
 
After Holly Jahangiri’s very useful feedback to my previous post, I thought I’d give the search engines she sampled a go for site:lucire.com.

Bear in mind that Duck Duck Go, Ecosia, Qwant, Onesearch and Swisscows license from Bing, and Startpage picks up Google, so their indices will reflect the mothership.

Here’s how we look today. Bing remains well and truly beaten by Google, Mojeek, Baidu and Yandex.
 
Google: 6,100
Mojeek: 3,569
Swisscows: 498
Baidu: 201
Startpage: 198
Virtual Mirage: 100
Yandex: 94
Bing: 50
Qwant: 50
Duck Duck Go: 49
Ecosia: 49
Brave: 14
Searchencrypt: 8
Searx: 0
Onesearch: blocked in New Zealand
 

I am not alone, it seems. This thread on Microsoft Answers was enlightening. Others in the thread have found themselves gone from Bing (but not Google), and Microsoft appears to know about it, admitting to some fault and escalating the issues internally, but nothing ever gets done.

I had that old theory, blogged about previously, that computer databases get worn after a while. I saw that with Vox, a lot of Facebook’s ills can be put down to it, and maybe Bing has now got there? No tech ever wants to admit it because of how crazy it sounds. But if we can lose data on hard drives and USB sticks, then I don’t care how many back-ups these big firms have, they are still fallible. (What if faults in one database are copied on to another, and the checksums weren’t verified?)
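
To illustrate the checksum point, here is a minimal sketch in Python. It is not a claim about how Bing or anyone else manages their data; the function and file names are hypothetical. It shows a copy being checked against its source, and a later check against a checksum recorded when the data were known to be good, which is the step that catches a fault that would otherwise be copied faithfully from one place to another.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(source, destination):
    """Copy a file, then raise if the copy's checksum differs from the source's."""
    expected = sha256_of(source)
    shutil.copyfile(source, destination)
    if sha256_of(destination) != expected:
        raise IOError(f"checksum mismatch copying {source} to {destination}")
    return expected


def matches_record(path, recorded_checksum):
    """Check a file against a checksum recorded when it was known to be good."""
    return sha256_of(path) == recorded_checksum


with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "index.db"
    backup = Path(folder) / "index-backup.db"
    original.write_bytes(b"example index data")
    recorded = copy_verified(original, backup)
    backup.write_bytes(b"example index dat4")  # simulate silent corruption
    print(matches_record(backup, recorded))  # prints False
```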

I replied to the Microsoft poster, and it’s a pretty good summary so far:

Hi EbinVThomas, here’s my experience, and I’ve run websites for three decades. The short version is I think Bing is stuffed and it’s not a Microsoft core business, so it doesn’t get much love (indeed, one of their FAQ pages has a heading about ‘seach’). I know the Microsoft fans will attack me for saying this, just as the Apple fans have a go at me when I say something negative about Macs, but I haven’t read anything to change my opinion.

We started vanishing from Bing earlier this year, maybe about three months ago. For some of our sites, I thought it was down to our belated switch to HTTPS, but as you’ll read, that wasn’t the case.

These sites date from (at their present domains) 1995, 1997, 2002 and 2008, and they are well linked, well respected, and one has been winning awards from 1997 to today. Google and Mojeek have no problems with any of them. Two of the sites (the 2002 and 1995 ones) did drop from their number-one and two positions on Google (for a name search) when they switched to HTTPS but one has mostly recovered, the other (from 1995, with a lot of inbound links to HTTP) fluctuates.

One of the other sites uses Duck Duck Go for its internal site search (and has done since the 2010s), which is powered by Bing. Earlier this year—say about six weeks ago—I noticed that the internal search was getting more and more useless, even though I knew the articles used to be found by DDG.

I began doing site:domain.com searches for this one. It had c. 50 entries on Bing, down from several thousand earlier in the year.

My first reaction was to blame ourselves—maybe it was the full switch to a secure server (some earlier pages were already on HTTPS), or something else. We also began using Cloudflare again after a 12-year break around this time.

I signed up to Bing Webmaster Tools. The site promptly went down to 10 entries! In other words, signing up to Tools made the site’s presence a lot, lot worse.

I found some weird site maps that I never put in, nor did any of my team. Nevertheless, I put in new, fresh ones last week, all pointing to HTTPS. Most of the pages have not been indexed.

I had to turn off Cloudflare’s IndexNow because it was sending some totally irrelevant and old pages and files to Bing. (So we can blame Cloudflare for some issues, but the majority still rests with Bing.)

Since the new site maps, Bing is now returning 53–5 entries (depending on the hour).

It finally included the home page which had been missing from the site: searches. Yet only yesterday Webmaster Tools said the page was not indexed because of certain issues, but it had been found in 2018. That made no sense as it was present until quite recently. Those issues included a description tag being too long (fine, I edited it), and no H1s (but why should there be? Not everyone wants humungous type on their page). But Bing had been fine historically with the page (since Bing started, so well before 2018) and it even appeared in the index during the last few weeks. A related page for our business has no H1s and an even longer meta description, and it’s on Bing. (It’s just not been entered into Webmaster Tools, which seems to be a kiss of death!)

Webmaster Tools even said it had accepted the site maps and the thousands of pages listed.

As far as I can make out, Webmaster Tools says one thing but reality says another.

So, was it Cloudflare and HTTPS that had knocked us? Well, no. Of the four sites I mentioned, we didn’t change the set-up of the one started in 2008. It’s a reference site, and has plenty of inbound links from Wikipedia since it’s fairly authoritative.

No Cloudflare, and still on HTTP. All fine on Google and Mojeek.

Also thousands of pages.

On Bing: 51 pages.

Thousands of entries have vanished since earlier this year, and I’m going to hazard a guess to say it began happening around the time you wrote your original post.

It has had a slight impact on our traffic, especially since we had promoted Duck Duck Go so heavily since 2010 and encouraged others to shift from Google to it.

It seems that Bing can now only cope with 50-odd pages from certain sites. The older sites have fewer pages indexed now on Bing than they did on Excite or Hotbot in the 1990s, and certainly far fewer than Altavista! Our sites are so incredibly varied—static, dynamic, HTML, PHP—so it can’t be structural or the way we have set things up. None have had issues at Google other than one that dropped in the index for a certain relevant search, and Mojeek is fine with them all and took the HTTPS shift for three of them in its stride.

These are such old sites with a history in Bing, so my feeling is that a new site won’t stand much of a chance.

This is a long way of confirming your original post: it’s not you, it’s them.

Posted in internet, technology, USA | 3 Comments »


Bing is definitely very broken, and it’s hurting Duck Duck Go

23.07.2022

The last few days have been about ‘How awesome is Mojeek?’ and ‘How shit is Bing?’

I’m finding great search results from Mojeek, and as a site search for Lucire, it’s absolutely brilliant. Blows Duck Duck Go (Bing with privacy) away, even back when DDG had a reasonably comprehensive index of our pages (before the HTTPS switch). I don’t have to subject anyone to Google tracking, and I didn’t have the hassle of installing an internal search ourselves.

Cisene, who I met via Mastodon, very helpfully suggested on that social network that I submit site maps for the Lucire website as that would take a reasonably short time to remedy Bing’s ills. I’ve never had to do them for Google or Mojeek: their spiders work as they have always done since the dawn of search engines. For some reason, Bing needs its hand held if I want it to have thousands of pages again, as it did earlier this year.
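
For anyone who hasn't made a site map before, here is a minimal sketch in Python of generating one in the sitemaps.org XML format, with every address pointed at HTTPS. The function name and the example URLs are hypothetical, and a real site map for thousands of pages would be built from the site's own list of files.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML as bytes, listing each URL once and forcing HTTPS."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in urls:
        if url.startswith("http://"):
            url = "https://" + url[len("http://"):]
        if url in seen:
            continue  # skip duplicates, including HTTP/HTTPS twins
        seen.add(url)
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages, for illustration only.
pages = [
    "http://example.com/",
    "https://example.com/",
    "https://example.com/about.shtml",
]
print(build_sitemap(pages).decode("utf-8"))
```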

One thing I found curious with Bing is its insistence, in a site search, on placing a page that we have not linked to since 2005 at the very top. Of course I could delete the page or program in a forwarder, or make a 301, but I was also once told that dead links and forwarders were bad things for search engines. Our ‘About’ page also ranks highly in all search engines, despite not having been linked to in anything we’ve done in over 15 years.

But where’s the home page? Happily, after submitting site maps, Bing’s index of our pages went from 10 to a whopping 55, and the home page appeared for the first time in a site:lucire.com search:
 

 

‘It’s an improvement,’ I thought, though the search engine is still massively handicapped compared to where it was at the start of 2022.

Checking on Bing Webmaster Tools to see where things were, I was curious to see it claim that it could not crawl or index our home page though it was discovered in 2018:
 

 

But you just crawled and indexed it. Which is it?

The excuses this time (as Big Tech people love to make stuff that blames users) are that there are no <H1> tags (I’ve got news for you, Bing: we don’t use them, and why should we? There was never any rule that stated that headlines must be between them, and no one else seems to care) and that the description is too long (again, it was fine for you before—and actually you’ve just shown that it is fine).
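
The two complaints are easy enough to check for yourself. Below is a minimal sketch in Python that counts <H1> tags and measures the meta description. The 160-character limit is my assumption for illustration only, as Webmaster Tools' actual threshold isn't stated here, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collect the number of <h1> tags and the meta description, if any."""

    def __init__(self):
        super().__init__()
        self.h1_count = 0
        self.description = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""


def audit(html, max_description=160):
    """Return a list of warnings; max_description is an assumed limit."""
    parser = PageAudit()
    parser.feed(html)
    warnings = []
    if parser.h1_count == 0:
        warnings.append("no <h1> tag")
    if parser.description is not None and len(parser.description) > max_description:
        warnings.append(f"meta description longer than {max_description} characters")
    return warnings


# Hypothetical page, for illustration only.
page = '<html><head><meta name="description" content="A short summary."></head>'
page += "<body><h2>Headline</h2></body></html>"
print(audit(page))
```

A page can trip both warnings and still be perfectly readable, which is rather the point.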

They aren’t in the business of search though, as their explanations reveal. It’s seach:
 

 
Goodness knows how many years that’s been there, ignored.

It’s all so slap-dash and unprofessional, and as Duck Duck Go search results are based on Bing’s, I’m going to have to stop recommending it. Fortunately, I found Mojeek at the perfect time.

I’m also discovering that maybe Bing can no longer handle more than 50-odd pages per site anyway, which, of course, makes it useless as an engine that powers a site search. (Like I keep saying, the defunct Excite in the 1990s could do better. Any search engine from those days could spider and index more effectively.) It would be in line with other Microsoft products, such as Notepad, where the software giant now prevents everyone from typing £ or €, except, presumably, people in the countries where those are the common, keyboard-accessible currency symbols. Want to write Cæsar drinks Nescafé? You can try, but the diphthong and é will be missing.

Today I searched site:autocade.net on Bing. Now, we never switched Autocade to HTTPS. After how all our sites fell, would you risk it? This site is dependent on search-engine traffic.

And here are the number of pages each search engine brings up for a site search.
 
Google: 4,080
Mojeek: 3,348
Bing: 51
Duck Duck Go: 50
Brave: 17 (plus 4 underneath first entry)
 

So I can’t keep blaming the switch to HTTPS, though our troubles with all search engines I knew of then began around this time. Autocade still slipped in Bing despite no down time; we went to a newer Mediawiki version, but that was about it. Everything progressed as it always did.

Google eventually allowed things to recover (for the most part), with the exception of our company website (which rose up to 13th before dropping to 26th today). Mojeek never even had an issue to begin with. But Bing and Duck Duck Go don’t link to Jack Yan & Associates’ website till after the 40th position.

So where are we now with the sites I last looked at?
 
Number of results for site:lucire.com
Google: 6,250
Mojeek: 3,563
Bing: 53
Duck Duck Go: 53
Brave: 15 (plus 4 underneath first entry)
 
Number of results for site:jackyan.com
Google: 1,860
Mojeek: 438
Duck Duck Go: 54
Bing: 43
Brave: 13 (plus 4 underneath first entry)
 
Number of results for site:jyanet.com
Google: 743
Mojeek: 296
Bing: 49
Duck Duck Go: 49
Brave: 20
 

I honestly think Bing is broken.

Just as well no one I know uses it, but quite a number of people do opt for Duck Duck Go, because of the work it’s done in promoting privacy. I still admire them for this stance. But as many of you know, it sources its results from Bing, so if one is broken, both will be. And that’s a darned shame as I almost hit 12 years of having Duck Duck Go as my default (from August 2010 or thereabouts).

All the more reason to retain Mojeek as my default search engine.

Will I bother looking any more into Bing? Probably not, but how do I convince all those to whom I recommended Duck Duck Go to check out Mojeek?

Posted in internet, marketing, publishing, technology, USA | 4 Comments »


Forget Duck Duck Go, Bing, and Google—I’m trying Mojeek

17.07.2022

It was disappointing to note that after switching to HTTPS, and signing on to Bing Webmaster Tools, the search engine results for those sites of ours that made the change are still severely compromised.

I’ve written about searches for my own name earlier, where my personal and company sites lost their first and second positions on all search engines that I knew of after we made the switch. Only Google has my personal site back up top, with the company site on the middle of the second page. Bing has my personal site at number two, and I’d love to tell you where the company site is, but their search engine results’ pages won’t let me advance beyond page 2 (clicking ‘next page’ lands you back on the same page; clicking ‘3’ and above still keeps you on p. 2). Duck Duck Go, which uses Bing results, has it well below that—I gave up looking. And this is after I signed up to Bing Webmaster Tools in the hope I could get the sites properly catalogued.

It’s a real shame because Duck Duck Go has been my default for 12 years this August.

However, it was the loss of search results for Lucire that really bothered me. Here’s a site that’s 25 years old, with plenty of inward links, and c. 5,000 pages. Before the switch to HTTPS, the popular search engines had thousands of pages from our site. These days, Bing and Duck Duck Go tell me they have dozens of pages from Lucire’s website. Again, only Google seems to have spidered everything.

When I check Bing Webmaster Tools, the spidering has been shockingly poor.

The received wisdom that you should have HTTPS instead of HTTP to do better in search engines is BS, and the belief that search engines will eventually catch up has also not been realized. We made the switch in March, and I’m to believe that Bing hasn’t completed the indexing of our sites.

Are they using the same computers New Zealand banks do? (Cheques used to clear overnight in the 1970s, and now banks tell us that even electronic payments can take days. When we last used cheques, they were telling us they would take five to seven days. Ergo, bank computers are slower today than in 1976.)

The real downer is that Lucire’s website search box is powered by Duck Duck Go, so our own site visitors can’t find the things they want to look for. If you believe some of the search engine marketing, over 40 per cent of site visitors use your search function.

What to do?

I began looking at having an internal search again. We used to have a WhatUSeek (later SiteLevel) internal site search, but that site’s search functions appear to be dead (the site is still live). A user on Mastodon recommended Sphinx Search, an open-source internal site search, but the instructions were too complex. I even saw real computer geeks having trouble. The only one that I could understand was called Sphider (I could follow the instructions and knew enough about PHP and MySQL), but it was last updated many years ago, and successor projects also looked a bit complex.

A Google internal search was absolutely out of the question, as I have no desire to expose our readers to tracking—which is why so many other Big Tech gadgets have been removed from our site(s). Baidu and Yandex also have very limited indices for our sites.

I am very fortunate to have tried Mojeek again, a British search engine recommended to me by Matias on July 2. What I didn’t know then was Mojeek has its own spider and its own index, so it doesn’t have to license anything from Bing. And, happily, it claims to have 3,535 results from lucire.com, which might not be as good as Google’s 5,830, but it beats Bing’s 50 earlier today. In fact, at the time of writing, Bing showed a grand total of 10. That’s how bad it’s got. Duck Duck Go now has 48, also down from a few thousand before March.

Like Google, Mojeek seems to have coped with the switch to HTTPS without falling to pieces! And guess what? For a search of my own name, my personal site is number one, and our work site is number two. Presumably, Mojeek is the only search engine which coped and behaved exactly as the experts said!

You can imagine my next move. Mojeek has a site search, so now all Lucire searches are done through it. And readers can actually find stuff again instead of coming up nearly empty (or having very irrelevant results) as they have done for months.

Duck Duck Go’s lustre had been wearing off as there were recent allegations that its browser allowed Microsoft to track its users, something which Duck Duck Go boss Gabriel Weinberg personally denied on Reddit, saying that users were still anonymous when loading their search results.

I still have good memories of chatting to Gabriel in the early days and figuring out ways of spreading the word on Duck Duck Go. My contribution was going to hotels and changing the search defaults on business centre computers. Back then I had the impression Duck Duck Go did some of its own spidering, but these days, if Bing has a shitty index for your site, the Duck will follow suit. And with HTTPS not living up to its promise, that’s simply not good enough.

Tonight, Mojeek is very much the site of the day here, and I heartily recommend you try it out. I’ve switched the desktop to Mojeek as a default, and I’ll see how it all progresses. Right now I feel it deserves our support more than Duck Duck Go. Finally, we might truly have an alternative to Google, and it’s run from the UK’s greenest data centre. With our servers now being greener, too, running out of Finland, the technology is starting to match up to our beliefs.
 

Google, the biggest index of them all
 

Mojeek, a creditable second place
 

This is it on Bing: a 25-year-old history on the web, and it says it has 10 pages from lucire.com. Altavista, Excite and Hotbot had more in the 1990s
 

Duck Duck Go is slightly better, with 48 results—down from the thousands it once had
 
After switching to HTTPS
Number of results for lucire.com
Google: 5,830
Mojeek: 3,535 (containing the word Lucire, as term-less searches are not allowed)
Duck Duck Go: 48
Bing: 10
 
Number of results for jackyan.com
Google: 878
Mojeek: 437 (containing the term “Jack Yan”)
Duck Duck Go: 54
Bing: 24
 
Number of results for jyanet.com
Google: 635
Mojeek: 297 (containing the word jyanet)
Duck Duck Go: 46
Bing: 10
 

Presumably the only search engine that could handle a server going from HTTP to HTTPS while preserving the domains’ positions

Posted in business, internet, publishing, technology, UK | 1 Comment »