Posts tagged ‘Microsoft’


Mystery sitemap files in Bing

12.08.2022

I only signed up to Bing Webmaster Tools when investigating why the company site did so poorly in Bing and Duck Duck Go—we now know it was nothing to do with us, and everything to do with a search engine basically disintegrating before our very eyes.

This, too, was interesting, from a screenshot dated July 20, 2022. I never added these sitemaps, and they all pre-date when I signed up to Webmaster Tools. They were all there when I went to the tools for lucire.com. They are not RSS feeds we’ve ever sanctioned, though of course someone could have created them intentionally to follow a subject. Maybe someone at Microsoft?
 

 

You may notice the number of pages: 51. These 51, however, have no real bearing on the 50-odd that Bing can display before it craps out.

I’ve since added sitemaps for the rest of the site, to no avail, natch.

Anyone else find weird sitemap files in their account after signing up?

Posted in internet, technology | No Comments »


What search engines show in their top 10 isn’t always relevant

09.08.2022

The Bing collapse did lead me to look at some of the ancient pages on the Lucire site that the search engines were still very fond of. For instance, the ‘About’ page was still appearing up top, which is bizarre since we haven’t made any links to it for years—it reflected our history in 2004.

Naturally, once I updated it, it promptly disappeared from Bing! Too new for Microsoft’s own Wayback Machine!

I was always told that you shouldn’t delete old pages, and that 301s were the best solution. I’m enough of a computing neophyte to not know how to implement 301s (.htaccess doesn’t work, at least not on our set-up) and page refreshes are often frowned upon, which is why so many old pages are still there.

However, you would naturally expect that a web spider following links would not rank anything that hasn’t been linked to for over a decade very highly. If the spider comes in, picks up the latest stuff from your home page, possibly the latest stuff from individual topic pages, it would figure out what all of these were linking to, and conclude that something from 2000 that was buried deep within the site was no longer current, or of only passing interest to surfers.

I realize I’ve had a go at search engines for burying relevant things in favour of novel things, but we’re talking pages here that aren’t even relevant. ‘About’ I’ll let them have, but a 2000 book reviews’ page? A subject index page from 2005 that hasn’t been linked to since 2005, where the few pages that do link to it are well outnumbered by newer ones? Because, the deletion of ‘About’ aside, here is what Bing thinks is the most important for site:lucire.com:
 

 

Google fares a little better. Our home page and current print edition ordering page are top, shopping is third, followed by the fashion contents’ page (makes sense). ‘About’ comes in fifth, for whatever reason, then a 2005 competition page that we should probably delete (it refreshes to another page from 2005—so much for refresh pages being bad for search engines).

Seventh is yet another ancient page from 2005, namely a frameset—which I’ve since updated so at least the main frame loads something current. The remainder are articles from 2011, 2022 and 2016. The next page comprises articles and tags, which seem to make sense.

Mojeek actually makes more sense than Google. Home page in first, the news page (the next most-updated) is second, followed by the travel contents’ page. Then there are two older print edition pages (2020 and 2012), followed by a bunch of articles (2013, 2014, 2013, 2013), and the directory page for Lucire TV. There’s nothing here that I find strange: everything is logically found by a spider going through the site, and maybe those four articles from the 2010s are relevant to the word Lucire (given that you can’t do site: searches on Mojeek without a keyword, so it repeats the word before the TLD)? The reference to the 2012 issue might be down to my having mentioned it recently during our 25th anniversary posts. But there are no refresh pages and no framesets.

Startpage, not Google, has a couple of frameset pages from 2000 and 2002 in their top 10 which again weren’t linked to, at least not purposefully (they were placed there to catch people trying to look at the directory index in the old days). There’s incredibly little “link juice” to these pages. However, ‘About’ (in 10th), and these two framesets aside, its Google-sourced results fare remarkably well. In order: home page, print edition ordering page, the two framesets, the news section, the shopping page (barely updated but I can see why it’s there), the community page, Lucire TV, the fashion contents, ‘About’.

Duck Duck Go is so compromised by Bing that it barely merits a mention here. Four pages from 2000 and 2005 that no current page links to, a 404 page that we’ve never even had on our site (!), articles from 2021, 2018, 2007 and 2000 (in that order), and a PDF (!) from 2004. Fancy having a 404 that never even existed in the top 10!

If I had my way, it’d be home page, followed by the different sections’ contents’ pages, then the most popular article—though if a couple of articles go (or went) viral, then I’d expect them sooner.

Both Mojeek and Google do well here, with four of these pages each in their top 10s. But it’s Startpage’s unfiltered Google results that do best, hitting linked, relevant pages in seven results out of the top 10. Bing and its licensees miss the mark completely. If you must have a Google bias, then Startpage is the way to go; for our purposes, Mojeek remains the better option.
 
★★★★★★★☆☆☆ Startpage
★★★★☆☆☆☆☆☆ Mojeek
★★★★☆☆☆☆☆☆ Google
★★☆☆☆☆☆☆☆☆ Virtual Mirage
★☆☆☆☆☆☆☆☆☆ Baidu
★☆☆☆☆☆☆☆☆☆ Yandex
☆☆☆☆☆☆☆☆☆☆ Bing
☆☆☆☆☆☆☆☆☆☆ Qwant
☆☆☆☆☆☆☆☆☆☆ Swisscows
☆☆☆☆☆☆☆☆☆☆ Brave
☆☆☆☆☆☆☆☆☆☆ Duck Duck Go (would give –1 for the 404 if I could)

Posted in France, internet, New Zealand, publishing, technology, UK, USA | No Comments »


Attempting re-entry into Bing’s Pubhub

08.08.2022

In early July, I wanted to see if we could add Lucire to Bing as a news source in their Pubhub—after all, Google has us as one, as Yahoo, Altavista and Excite had back in the day. And I’d say that 25 years of publishing with an international team might qualify us as being media.

The folks came back rejecting us, saying we needed to come back in a month’s time. Usual story: look at our rules, you must have messed up.

Bing tells everyone this these days, because it’s a good way to keep webmasters confounded as they try to figure out what’s wrong with their site and why they can’t get it listed. It’s the same with Pubhub.

The one “rule” that might be very broadly interpreted in their favour was that articles needed to have bylines. Granted, a lot of news ones don’t, since sometimes we don’t want credit for them, and you don’t always see a reporter’s name for shorter, simpler items. But features do have bylines. And when Bing swung round in early July, coincidentally I had written quite a lot of the last bunch of articles, so my name was all over them. That was a no-no.

So here we are, a month and a few days on. The home page (the one that Bing declines to include in their index now, as it prefers pages from the early 2000s that we haven’t linked to for over 17 years) contains articles from me, Stanley Moss, Lola Cristall, Jody Miller, and Elyse Glickman. There’s one story on Panos Papadopoulos that he wrote in the first person.

What’s the bet that nothing will happen?

Sometimes you have to give it a go, even when you know nothing will happen—just to prove a point.
 

Above: The top pages in a site:lucire.com search on Bing. Five of these pages we haven’t linked to in 17 years. As a search engine, it makes absolutely no sense.
 
I was surprised, however, that Bing claims to have 330 results for site:lucire.com today, up from 10. It’s still a tenth of what Mojeek has, and a twentieth of what Google has. But it is an improvement. Maybe the worst is over?

It’s still useless as a general search though, and even more useless as an internal search. The fact that popular pages are excluded and 17-year-old ones aren’t means something remains very wrong with the search engine.
 
PS. (August 9 NZST): I spoke too soon. Bing says 330 results, but try looking beyond 50, which was what it tended to cap Lucire at.
 

Posted in internet, media, publishing, technology | No Comments »


Mojeek shows more in its search results than Google

02.08.2022

This was something I had forgotten when doing the numbers on how many pages each search engine had indexed from our sites: what they claim to be their index size and what they let you access are two different things.

And in Lucire’s case, Google, curiously, mostly does not allow access to our dynamic pages in PHP in its main index, reserving them for Google News. Google News, however, has both PHP and HTML. It’s only when you feed in a specific request for one of our stories that we know is on a PHP-generated page that it comes up in the main index’s results.

Let me explain. Remember this from a blog post in July? These are what the search engines said they had indexed for lucire.com (in a site:lucire.com search). I’ve updated it for August 2 and added one more search engine, Yep, another independent, out of interest.
 
Google: 10,600
Mojeek: 3,593
Duck Duck Go: 50
Brave: 19
Bing: 10
Yep: 10
 

But can you see 10,600? Here’s the reality of what is truly visible at the moment when you browse the results’ pages of each search engine as of today:
 
Google: 304
Mojeek: 1,000
Duck Duck Go: 50
Brave: 19
Bing: 10
Yep: 10
 


Above: Google (top) shows fewer pages than Mojeek in a site: search.
 

Mojeek maxes out at 1,000 by design, but like Google, it will find a specific article outside of the 1,000 shown if searched for. Google conks out at 304 (303 when I first did this test).

The bigger Google index is its advantage, but Mojeek does a fine job by sharing more in its results’ pages than Google does—over three times as many. Another win for the plucky independent out of the UK.
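As a rough check on the figures above, here is a small Python sketch (the variable and function names are only for illustration) comparing what each engine claims for site:lucire.com with what it actually lets you page through.

```python
# Figures for site:lucire.com quoted in this post (2 August 2022).
claimed = {"Google": 10_600, "Mojeek": 3_593}
visible = {"Google": 304, "Mojeek": 1_000}


def visible_share(engine: str) -> float:
    """Percentage of the claimed index that can actually be browsed."""
    return visible[engine] / claimed[engine] * 100


for engine in claimed:
    print(
        f"{engine}: {visible[engine]:,} of {claimed[engine]:,} "
        f"visible ({visible_share(engine):.1f}%)"
    )

# How many times more results Mojeek shows than Google.
mojeek_vs_google = visible["Mojeek"] / visible["Google"]
print(f"Mojeek shows {mojeek_vs_google:.2f} times as many results as Google")
```

On these numbers Mojeek exposes a far larger share of what it claims to hold, which is where the “over three times as many” comes from.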
 
While we’re on the subject, notice how small the Bing index is getting, returning just 10 pages for lucire.com? It’s really collapsed in a big way. Feeding in the other sites I tested earlier, Bing shows declines all round, apart from Travel & Leisure.

Fancy having only 2,723 results from The New York Times, down from 1,190,000 on the 24th ult. Mojeek has over 1,000 times more than Bing, and Google over 12 times more than that.

Previous numbers in parentheses below.
 
Die Zeit
Google: 2,710,000 (2,600,000)
Mojeek: 4,891 (4,796)
Bing: 3,268 (3,770)
 
Annabelle (Switzerland)
Google: 11,900 (11,700)
Mojeek: 408 (405)
Bing: 26 (105)
 
Holly Jahangiri
Google: 618 (738)
Mojeek: 236 (222)
Bing: 10 (49)
 
The Gloss (Ireland)
Google: 17,600 (19,200)
Mojeek: 2,009 (1,968)
Bing: 20 (71)
 
The New York Times
Google: 36,500,000 (36,200,000)
Mojeek: 2,879,513 (2,823,329)
Bing: 2,723 (1,190,000)
 
Lucire
Google: 10,600 (6,050)
Mojeek: 3,593 (3,572)
Bing: 10 (50)
 
The Rake
Google: 11,100 (11,500)
Mojeek: 1,445 (1,443)
Bing: 16, but claims 4! (49)
 

 
Travel & Leisure
Google: 33,500 (28,100)
Mojeek: 10,081 (9,750)
Bing: 383 (220)
 
Microsoft
Google: 118,000,000 (122,000,000)
Bing: 1,927,118 (14,200,000)
Mojeek: 1,772,165 (1,748,199)
 
Detective Marketing
Google: 961 (998)
Mojeek: 579 (579)
Bing: 16 (51)
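
The New York Times comparison made earlier in this post can be checked with a few lines of Python. This is only a sketch using the counts listed above; the variable names are mine.

```python
# site: search counts for The New York Times, as listed above (2 August 2022).
nyt = {"Google": 36_500_000, "Mojeek": 2_879_513, "Bing": 2_723}
bing_previous = 1_190_000  # Bing's figure on 24 July

mojeek_vs_bing = nyt["Mojeek"] / nyt["Bing"]
google_vs_mojeek = nyt["Google"] / nyt["Mojeek"]
bing_drop_percent = (1 - nyt["Bing"] / bing_previous) * 100

print(f"Mojeek has {mojeek_vs_bing:,.0f} times Bing's count")
print(f"Google has {google_vs_mojeek:.1f} times Mojeek's count")
print(f"Bing's count fell by {bing_drop_percent:.2f} per cent")
```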

Posted in internet, technology, UK, USA | No Comments »


We’ve reached 4,600 models on Autocade

02.08.2022


 
We’ve hit 4,600 models on Autocade, with the Toyota Will VS taking us to this point, but the stats show we are sitting on 1,180,548 views. We have to get to 1,352,989 on the new count before I can announce we’ve reached 29 million page views.

We’re looking at the lowest traffic on Autocade since 2019, and I’m sure the collapse of the Bing index, taking down the indices of all associated search engines (Duck Duck Go, Qwant, etc.), is to blame. I used to see an increase of 100,000 every week, roughly, but not these days. (PS.: I was still observing this level when we first switched the site over, and the slower growth has probably coincided with when WorldWideWebSize.com recorded Bing’s plummet in late May–early June.)
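
For anyone wanting the arithmetic, a quick Python sketch of how far away the next million is, using only the numbers quoted above (the weekly figure is the rough one from the old days, so treat the result as indicative at best):

```python
current_views = 1_180_548    # views on the new count, from this post
target_views = 1_352_989     # new-count figure needed for 29 million overall
old_weekly_growth = 100_000  # rough weekly increase seen before the slowdown

remaining = target_views - current_views
weeks_at_old_rate = remaining / old_weekly_growth

print(f"Views still needed: {remaining:,}")
print(f"Weeks needed at the old rate: {weeks_at_old_rate:.1f}")
```

At the old pace that gap would have closed in under a fortnight; at the current, slower rate it will take longer.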

Autocade is the one site where we never changed the set-up, other than hosting provider and Mediawiki version. The other sites had various things done to them, with Cloudflare and HTTPS. So given the “invisible” changes—changes we had done before in years gone by—we know “it’s not us, it’s them”.

I’ve listed the three Will models (or WiLL to use the original styling) as Toyotas after I confirmed this with another motorhead, the very knowledgeable Atsuhiro Takeda. They were also always listed as Toyotas by Auto Katalog many years ago, and I believe also by Toutes les voitures du monde. Atsu confirmed that that was how he believed they should be indexed. I’ve had those Will publicity images for a long time and it’s nice they’ve finally gone online in Autocade.

The only oddity in the Autocade stats is the rise in hits for our page on the Kia Morning (TA), coming from nowhere and into sixth place among model pages. Whoever the Morning fans are, I thank you!

Posted in cars, internet, publishing | No Comments »


Bing has tanked

24.07.2022

Well, folks, here’s someone who’s done the maths. The stats in the last post suggested as much but the sample was so small.

Maurice de Kunder at WorldWideWebSize.com has a definitive graph:
 

 

His methodology is explained at his site.

I’d say late May or early June was when I noticed Duck Duck Go queries on Lucire become largely useless. After a month of seeing no improvement, I began looking into alternatives.

No one knows why, since Bing’s not going to admit any of this. If I was Duck Duck Go, I’d be looking into alternatives smartly. Anyone want to get in touch with Alltheweb and Inktomi? Their indices in the early 2000s were bigger than this.
 
PS.: I tried to tell the SEO sub-Reddit, but no joy. It was immediately removed.
 

 
The original text:

Since June I noticed that our internal site:domain.com searches powered by Duck Duck Go were not returning many results any more. As DDG is powered by Bing, I checked it out there, and, sure enough, we dipped from thousands of entries to 50 (and even 10 at one point). This is a 25-year-old site with decent inbound links.

I did a lot of investigating which I wrote up on my own blog (which I won’t link here due to sub-Reddit rules) and came across this website, which seems to suggest Bing has tanked. The person who runs it is pretty clued up on statistics.

I have run a small sample of 10 sites through the search engines as well and these back up their findings.

At this rate, Bing is smaller than Inktomi and Alltheweb in the early 2000s. What strikes me as weird is that all the Bing licensees haven’t done anything, either, so Duck Duck Go, Ecosia, Qwant, and Onesearch have all shrunk, too. (Swisscows is still reasonably sized.)

Anyone else been through something similar in the last two months?

Why don’t they wish to know? I would have thought this was rather serious for an SEO group.

Posted in internet, technology | 4 Comments »


More signs of Bing’s tiny index

24.07.2022

Because I have OCD, one more round of stats.

It’s not just us: Bing seems to have a reduced index for everyone. Here are a handful of sites that I fed in at random for site: searches. The only site where it beats Mojeek in indexed pages is, you guessed it, Microsoft’s. I guess since Google favours Google’s own results, Bing does a better job indexing Microsoft’s—and I doubt it’s because their own people conform to Bing’s applied-when-they-choose rules.
 
Die Zeit
Google: 2,600,000
Mojeek: 4,796 (0·18 per cent of Google’s total)
Bing: 3,770 (0·15 per cent of Google’s total)
 
Annabelle (Switzerland)
Google: 11,700
Mojeek: 405 (3·46%)
Bing: 105 (0·90%)
 
Holly Jahangiri
Google: 738
Mojeek: 222 (30·08%)
Bing: 49 (6·64%)
 
The Gloss (Ireland)
Google: 19,200
Mojeek: 1,968 (10·25%)
Bing: 71 (0·37%)
 
The New York Times
Google: 36,200,000
Mojeek: 2,823,329 (7·80%)
Bing: 1,190,000 (3·29%)
 
Lucire
Google: 6,050
Mojeek: 3,572 (59·04%)
Bing: 50 (0·83%)
 
The Rake
Google: 11,500
Mojeek: 1,443 (12·55%)
Bing: 49 (0·43%)
 
Travel & Leisure
Google: 28,100
Mojeek: 9,750 (34·70%)
Bing: 220 (0·78%)
 
Microsoft
Google: 122,000,000
Bing: 14,200,000 (11·64%)
Mojeek: 1,748,199 (1·43%)
 
Detective Marketing
Google: 998
Mojeek: 579 (58·02%)
Bing: 51 (5·11%)
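
The percentages above are each engine’s count divided by Google’s. A minimal Python sketch of that calculation, using three of the sites from the list (the function name is my own):

```python
def share_of_google(count: int, google_count: int) -> float:
    """Return a count as a percentage of Google's count, to two decimals."""
    return round(count / google_count * 100, 2)


# (Google, Mojeek, Bing) counts taken from the list above.
sites = {
    "Lucire": (6_050, 3_572, 50),
    "The New York Times": (36_200_000, 2_823_329, 1_190_000),
    "Microsoft": (122_000_000, 1_748_199, 14_200_000),
}

for name, (google, mojeek, bing) in sites.items():
    print(
        f"{name}: Mojeek {share_of_google(mojeek, google):.2f}%, "
        f"Bing {share_of_google(bing, google):.2f}%"
    )
```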
 

In the earlier Microsoft thread I linked, the original poster found that after they joined Bing Webmaster Tools and imported their Google data, that’s when their site vanished from Bing. So, again, we’re not alone.

I’d seriously be rethinking my business model if I was running a search engine that was reliant on Bing.

Posted in internet, media, publishing, technology, USA | No Comments »


Putting the search engines through their paces

24.07.2022

One more, and I might give the subject a rest. Here I test the search engines for the term Lucire. This paints quite a different picture.

Lucire is an established site, dating from 1997, indexed by all major search engines from the start. The word did not exist online till the site began. It does exist in old Romanian. There is a (not oft-used) Spanish conjugated verb, I believe, spelt the same.

The original site is very well linked online, as you might expect after 25 years. You would normally expect, given its age and the inbound links, to see lucire.com at the top of any index.

There is a Dr Yolande Lucire in Australia whom I know, and whom I’m used to seeing in the search engine results.

The scores are simply for getting sites relevant to us into the top 10, and no judgement is made about their quality or relevance.
 
Google
lucire.com
twitter.com
lucire.net
instagram.com
wikipedia.org
linkedin.com
facebook.com
pinterest.nz
neighbourly.co.nz
—I hate to say it, as someone who dislikes Google, but all of the top 10 results are relevant. Fair play. Then again, with the milliards it has, and with this as its original product, it should do well. 10/10
 
Mojeek
scopalto.com
lucirerouge.com
lucire.net
lucire.com
mujerhoy.com
portalfeminino.com
paperblog.com
dailymotion.com
eldiablovistedezara.net
hispanaglobal.com
Mojeek might be flavour of the month for me, but these results are disappointing. Scopalto retails Lucire in France, so that’s fair enough, but disappointing to see the original lucire.com site in fourth. Fifth, sixth, seventh, ninth and tenth are irrelevant and relate to the Spanish word lucir. You’d have to get to no. 25 to see Lucire again, for Yola’s website. Then it’s more lucir results till no. 52, the personal website of one of our editors. 5/10
 
Swisscows
lucire.net
wikipedia.org
lucire.com
spanishdict.com
lucire.net
lucire.com
drlucire.com
facebook.com
spanishdict.com
viyeshierelucre.com
—Considering it sources from Bing, it makes the same mistakes by placing the rarely linked lucire.net up top, and lucire.com in third. Fourth, ninth and tenth are irrelevant, and the last two relate to different words. Yola’s site is seventh, which is fair enough. 6/10
 
Baidu
lucire.net
lucire.com
lucire.cc
lucire.com
kanguowai.com
hhlink.com
vocapp.com
forvo.com
kuwo.cn
lucirehome.com
—Interesting mixture here. Strange, too, that lucire.net comes up top. We own lucire.cc but it’s now a forwarding domain (it was once our link shortener, up to a decade ago). Seventh and ninth relate to the Romanian word strălucire and eighth to the Romanian word lucire. The tenth domain is an old one, succeeded a couple of years ago by lucirerouge.com. Not very current, then. 7/10
 
Startpage
lucire.com
lucire.com
lucire.net
instagram.com
wikipedia.org
linkedin.com
facebook.com
pinterest.nz
fashionmodeldirectory.com
twitter.com
—All relevant, as expected, since it’s all sourced from Google. 10/10
 
Virtual Mirage
lucire.com
instagram.com
wikipedia.org
lucire.net
facebook.com
linkedin.com
pinterest.nz
lucirerouge.com
nih.gov
twitter.com
—I don’t know much about this search engine, since I only heard about it from Holly Jahangiri earlier today. A very good effort, with only the ninth one being irrelevant to us: it’s a paper co-written by Yola. 9/10
 
Yandex
lucire.com
lucire.net
facebook.com
twitter.com
wikipedia.org
instagram.com
wikipedia.eu
pinterest.nz
en-academic.com
wikiru.wiki
—This is the Russian version. All are relevant, and they are fairly expected, other than the ninth result which I’ve not come across this high before, although it still relates to Lucire. 10/10
 
Bing
lucire.net
wikipedia.org
lucire.com
spanishdict.com
lucire.com
facebook.com
drlucire.com
spanishdict.com
twitter.com
lucirahealth.com
—How Bing has slipped. There are sites here relating to the Spanish word lucirse and to Lucira, who makes PCR tests for COVID-19. One is for Yola. 7/10
 
Qwant.com
lucire.net
wikipedia.org
spanishdict.com
drlucire.com
spanishdict.com
tumblr.com
lucirahealth.com
lacire.co
amazon.com
lucirahealth.com
—For a Bing-licensed site, this is even worse. No surprise to see lucire.com gone here, given how inconsistently Bing has treated it of late. But there are results here for Lucira and a company called La Cire. The Amazon link is also for Lucira. 3/10
 
Qwant.fr
lucire.net
wikipedia.org
reverso.net
luciremen.com
lucire.com
twitter.com
lacire.co
lucirahealth.com
viyeshierelucre.com
lucirahealth.com
—The sites change slightly if you use the search box at qwant.fr. The Reverso page is for the Spanish word luciré. Sixth through tenth are irrelevant and do not even relate to the search term. Eleventh and twelfth are for lucire.com and facebook.com, so there were more relevant pages to come. The ranking of relevant results, then, leaves something to be desired. 5/10
 
Duck Duck Go
lucire.com
lucire.net
wikipedia.org
spanishdict.com
drlucire.com
spanishdict.com
lucirahealth.com
amazon.com
lacire.co
luciremen.com
—Well, at least the Duck puts lucire.com up top, and the home page at that (even if Bing can’t). Only four relevant results, with Lucire Men coming in at tenth. 4/10
 
Brave
lucire.com
instagram.com
twitter.com
wikipedia.org
linkedin.com
lucire.net
facebook.com
fashion.net
wiktionary.org
nsw.gov.au
—For the new entrant, not a bad start. Shame about the smaller index size. All of these relate to us except the last two, one a dictionary and the other referring to Yolande Lucire. 8/10
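
The scoring itself is nothing more than a count of relevant results in the top 10. As a sketch in Python, using the Brave list above: the relevance calls were made by hand, result by result, so keying them on domain here is a simplification that only works when no domain repeats.

```python
# Brave's top 10 for the term Lucire, as listed above.
brave_top_10 = [
    "lucire.com", "instagram.com", "twitter.com", "wikipedia.org",
    "linkedin.com", "lucire.net", "facebook.com", "fashion.net",
    "wiktionary.org", "nsw.gov.au",
]

# Results judged not to relate to us: the dictionary entry and
# the page referring to Yolande Lucire.
not_relevant = {"wiktionary.org", "nsw.gov.au"}


def score(results: list[str], rejected: set[str]) -> int:
    """Count how many of the first ten results were judged relevant."""
    return sum(1 for domain in results[:10] if domain not in rejected)


print(f"Brave: {score(brave_top_10, not_relevant)}/10")
```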
 

The results are surprising from these first results’ pages.
 
★★★★★★★★★★ Google
★★★★★★★★★★ Yandex
★★★★★★★★★★ Startpage
★★★★★★★★★☆ Virtual Mirage
★★★★★★★★☆☆ Brave
★★★★★★★☆☆☆ Baidu
★★★★★★★☆☆☆ Bing
★★★★★★☆☆☆☆ Swisscows
★★★★★☆☆☆☆☆ Mojeek
★★★★★☆☆☆☆☆ Qwant.fr
★★★★☆☆☆☆☆☆ Duck Duck Go
★★★☆☆☆☆☆☆☆ Qwant.com
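
For the curious, the star chart above is just the scores sorted from highest to lowest. A small Python sketch that reproduces it (the function name is my own):

```python
# Scores out of 10 from the commentary above.
scores = {
    "Google": 10, "Yandex": 10, "Startpage": 10, "Virtual Mirage": 9,
    "Brave": 8, "Baidu": 7, "Bing": 7, "Swisscows": 6, "Mojeek": 5,
    "Qwant.fr": 5, "Duck Duck Go": 4, "Qwant.com": 3,
}


def star_bar(score: int, out_of: int = 10) -> str:
    """Render a score as filled and empty stars."""
    return "★" * score + "☆" * (out_of - score)


# sorted() is stable, so engines with equal scores keep the order given above.
for engine, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(star_bar(value), engine)
```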
 

It doesn’t change my mind about the suitability of Mojeek for internal searches though. It’s still the one with the largest index aside from Google, and it doesn’t track you.

Posted in China, France, internet, publishing, technology, UK, USA | 2 Comments »


Not alone in discovering Bing is broken

24.07.2022


MIA again on Bing: Lucire’s home page. The alt tags are not missing, with perhaps some exceptions for small logos. And not having an H1 tag is not fatal to other pages of ours that have been indexed. It remains bizarre.
 
After Holly Jahangiri’s very useful feedback to my previous post, I thought I’d give the search engines she sampled a go for site:lucire.com.

Bear in mind that Duck Duck Go, Ecosia, Qwant, Onesearch and Swisscows license from Bing, and Startpage picks up Google, so their indices will reflect the mothership.

Here’s how we look today. Bing remains well and truly beaten by Google, Mojeek, Baidu and Yandex.
 
Google: 6,100
Mojeek: 3,569
Swisscows: 498
Baidu: 201
Startpage: 198
Virtual Mirage: 100
Yandex: 94
Bing: 50
Qwant: 50
Duck Duck Go: 49
Ecosia: 49
Brave: 14
Searchencrypt: 8
Searx: 0
Onesearch: blocked in New Zealand
 

I am not alone, it seems. This thread on Microsoft Answers was enlightening. Others in the thread have found themselves gone from Bing (but not Google), and Microsoft appears to know about it, admitting to some fault and escalating the issues internally, but nothing ever gets done.

I had that old theory, blogged about previously, that computer databases get worn after a while. I saw that with Vox, a lot of Facebook’s ills can be put down to it, and maybe Bing has now got there? No tech ever wants to admit it because of how crazy it sounds. But if we can lose data on hard drives and USB sticks, then I don’t care how many back-ups these big firms have, they are still fallible. (What if faults in one database are copied on to another, and the checksums weren’t verified?)
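
To illustrate the point in parentheses, here is a minimal Python sketch of why an unverified copy can carry a fault along with it. It is only the general idea, with made-up data, and says nothing about how Bing or anyone else actually stores things.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Checksum recorded when the record was first written (hypothetical data).
original = b"lucire.com/index.html"
recorded = checksum(original)

# The stored record is later damaged, and the damage is copied to a back-up.
damaged = b"lucire.com/index.htmm"
backup = bytes(damaged)

# Comparing the back-up only with its already damaged source finds nothing wrong.
copies_match = checksum(backup) == checksum(damaged)

# Verifying against the checksum recorded at write time exposes the fault.
backup_is_intact = checksum(backup) == recorded

print(f"Back-up matches its source: {copies_match}")
print(f"Back-up matches the recorded checksum: {backup_is_intact}")
```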

I replied to the Microsoft poster, and it’s a pretty good summary so far:

Hi EbinVThomas, here’s my experience, and I’ve run websites for three decades. The short version is I think Bing is stuffed and it’s not a Microsoft core business, so it doesn’t get much love (indeed, one of their FAQ pages has a heading about ‘seach’). I know the Microsoft fans will attack me for saying this, just as the Apple fans have a go at me when I say something negative about Macs, but I haven’t read anything to change my opinion.

We started vanishing from Bing earlier this year, maybe about three months ago. For some of our sites, I thought it was our belated switch to HTTPS, but as you’ll read, that wasn’t the case.

These sites date from (at their present domains) 1995, 1997, 2002 and 2008, and they are well linked, well respected, and one has been winning awards from 1997 to today. Google and Mojeek have no problems with any of them. Two of the sites (the 2002 and 1995 ones) did drop from their number-one and two positions on Google (for a name search) when they switched to HTTPS but one has mostly recovered, the other (from 1995, with a lot of inbound links to HTTP) fluctuates.

One of the other sites uses Duck Duck Go for its internal site search (and has done since the 2010s), which is powered by Bing. Earlier this year—say about six weeks ago—I noticed that the internal search was getting more and more useless, even though I knew the articles used to be found by DDG.

I began doing site:domain.com searches for this one. It had c. 50 entries on Bing, down from several thousand earlier in the year.

My first reaction was to blame ourselves—maybe it was the full switch to a secure server (some earlier pages were already on HTTPS), or something else. We also began using Cloudflare again after a 12-year break around this time.

I signed up to Bing Webmaster Tools. The site promptly went down to 10 entries! In other words, signing up to Tools made the site’s presence a lot, lot worse.

I found some weird site maps that I never put in, nor did any of my team. Nevertheless, I put in new, fresh ones last week, all pointing to HTTPS. Most of the pages have not been indexed.

I had to turn off Cloudflare’s IndexNow because it was sending some totally irrelevant and old pages and files to Bing. (So we can blame Cloudflare for some issues, but the majority still rests with Bing.)

Since the new site maps, Bing is now returning 53–55 entries (depending on the hour).

It finally included the home page which had been missing from the site: searches. Yet only yesterday Webmaster Tools said the page was not indexed because of certain issues, but it had been found in 2018. That made no sense as it was present until quite recently. Those issues included a description tag being too long (fine, I edited it), and no H1s (but why should there be? Not everyone wants humungous type on their page). But Bing had been fine historically with the page (since Bing started, so well before 2018) and it even appeared in the index during the last few weeks. A related page for our business has no H1s and an even longer meta description, and it’s on Bing. (It’s just not been entered into Webmaster Tools, which seems to be a kiss of death!)

Webmaster Tools even said it had accepted the site maps and the thousands of pages listed.

As far as I can make out, Webmaster Tools says one thing but reality says another.

So, was it Cloudflare and HTTPS that had knocked us? Well, no. Of the four sites I mentioned, we didn’t change the set-up of the one started in 2008. It’s a reference site, and has plenty of inbound links from Wikipedia since it’s fairly authoritative.

No Cloudflare, and still on HTTP. All fine on Google and Mojeek.

Also thousands of pages.

On Bing: 51 pages.

Thousands of entries have vanished since earlier this year, and I’m going to hazard a guess to say it began happening around the time you wrote your original post.

It has had a slight impact on our traffic, especially since we had promoted Duck Duck Go so heavily since 2010 and encouraged others to shift from Google to it.

It seems that Bing can now only cope with 50-odd pages from certain sites. The older sites have fewer pages indexed now on Bing than they did on Excite or Hotbot in the 1990s, and certainly far fewer than Altavista! Our sites are so incredibly varied—static, dynamic, HTML, PHP—so it can’t be structural or the way we have set things up. None have had issues at Google other than one that dropped in the index for a certain relevant search, and Mojeek is fine with them all and took the HTTPS shift for three of them in its stride.

These are such old sites with a history in Bing, so my feeling is that a new site won’t stand much of a chance.

This is a long way of confirming your original post: it’s not you, it’s them.

Posted in internet, technology, USA | 3 Comments »


Bing is definitely very broken, and it’s hurting Duck Duck Go

23.07.2022

The last few days have been about ‘How awesome is Mojeek?’ and ‘How shit is Bing?’

I’m finding great search results from Mojeek, and as a site search for Lucire, it’s absolutely brilliant. Blows Duck Duck Go (Bing with privacy) away, even back when DDG had a reasonably comprehensive index of our pages (before the HTTPS switch). I don’t have to subject anyone to Google tracking, and I didn’t have the hassle of installing an internal search ourselves.

Cisene, whom I met via Mastodon, very helpfully suggested on that social network that I submit site maps for the Lucire website, as that would remedy Bing’s ills in a reasonably short time. I’ve never had to do them for Google or Mojeek: their spiders work as they have always done since the dawn of search engines. For some reason, Bing needs its hand held if I want it to have thousands of pages again, as it did earlier this year.
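
For anyone who hasn’t made a site map before, here is a minimal sketch in Python of what one looks like, following the sitemaps.org format. It is not how ours were generated, and the article address in the usage lines is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> bytes:
    """Build a bare-bones XML site map with one <url><loc> entry per address."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = address
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap = build_sitemap([
    "https://lucire.com/",
    "https://lucire.com/example-article.html",  # hypothetical page
])
print(sitemap.decode("utf-8"))
```

The resulting file is what gets uploaded to the site and then submitted through the search engine’s webmaster tools.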

One thing I found curious with Bing is its insistence, in a site search, on placing a page that we have not linked to since 2005 at the very top. Of course I could delete the page or program in a forwarder, or make a 301, but I was also once told that dead links and forwarders were bad things for search engines. Our ‘About’ page also ranks highly in all search engines, despite not being linked to in anything we’ve done in over 15 years as well.

But where’s the home page? Happily, after submitting site maps, Bing’s index of our pages went from 10 to a whopping 55, and the home page appeared for the first time in a site:lucire.com search:
 

 

‘It’s an improvement,’ I thought, though the search engine is still massively handicapped compared to where it was at the start of 2022.

Checking on Bing Webmaster Tools to see where things were, I was curious to see it claim that it could not crawl or index our home page though it was discovered in 2018:
 

 

But you just crawled and indexed it. Which is it?

The excuses this time (as Big Tech people love to make stuff that blames users) are that there are no <H1> tags (I’ve got news for you, Bing: we don’t use them, and why should we? There was never any rule that stated that headlines must be between them, and no one else seems to care) and that the description is too long (again, it was fine for you before—and actually you’ve just shown that it is fine).
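
If you want to see whether your own pages would trip the same two complaints, a rough Python sketch follows. The 160-character limit is purely my assumption for illustration, since Webmaster Tools did not say what its threshold was, and the class and function names are made up.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collect the number of <h1> tags and the meta description, if any."""

    def __init__(self) -> None:
        super().__init__()
        self.h1_count = 0
        self.description = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""


def audit(html: str, max_description: int = 160) -> list[str]:
    """Return the complaints a page would attract (the limit is an assumed value)."""
    parser = PageAudit()
    parser.feed(html)
    issues = []
    if parser.h1_count == 0:
        issues.append("no <h1> tag")
    if parser.description is not None and len(parser.description) > max_description:
        issues.append(f"description longer than {max_description} characters")
    return issues


sample = '<html><head><meta name="description" content="' + "x" * 200 + '"></head><body><p>Hello</p></body></html>'
print(audit(sample))
```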

They aren’t in the business of search though, as their explanations reveal. It’s seach:
 

 
Goodness knows how many years that’s been there, ignored.

It’s all so slap-dash and unprofessional, and as Duck Duck Go search results are based on Bing’s, I’m going to have to stop recommending it. Fortunately, I found Mojeek at the perfect time.

I’m also discovering that maybe Bing can no longer handle more than 50-odd pages per site anyway, which, of course, makes it useless as an engine that powers a site search. (Like I keep saying, the defunct Excite in the 1990s could do better. Any search engine from those days could spider and index more effectively.) It would be in line with other Microsoft products, such as Notepad, where the software giant now prevents us from typing £ or , except, presumably, people from the countries where those are the common, keyboard-accessible currency symbols. Want to write Cæsar drinks Nescafé? You can try, but the diphthong and é will be missing.

Today I searched site:autocade.net on Bing. Now, we never switched Autocade to HTTPS. After how all our sites fell, would you risk it? This site is dependent on search-engine traffic.

And here is the number of pages each search engine brings up for a site search.
 
Google: 4,080
Mojeek: 3,348
Bing: 51
Duck Duck Go: 50
Brave: 17 (plus 4 underneath first entry)
 

So I can’t keep blaming the switch to HTTPS, though our troubles with all search engines I knew of then began around this time. Autocade still slipped in Bing despite no down time; we went to a newer Mediawiki version, but that was about it. Everything progressed as it always did.

Google eventually allowed things to recover (for the most part) with the exception of our company website (which rose up to 13th before dropping to 26th today), Mojeek never even had an issue to begin with, but Bing and Duck Duck Go don’t link to Jack Yan & Associates’ website till after the 40th position.

So where are we now with the sites I last looked at?
 
Number of results for site:lucire.com
Google: 6,250
Mojeek: 3,563
Bing: 53
Duck Duck Go: 53
Brave: 15 (plus 4 underneath first entry)
 
Number of results for site:jackyan.com
Google: 1,860
Mojeek: 438
Duck Duck Go: 54
Bing: 43
Brave: 13 (plus 4 underneath first entry)
 
Number of results for site:jyanet.com
Google: 743
Mojeek: 296
Bing: 49
Duck Duck Go: 49
Brave: 20
 

I honestly think Bing is broken.

Just as well no one I know uses it, but quite a number of people do opt for Duck Duck Go, because of the work it’s done in promoting privacy. I still admire them for this stance. But as many of you know, it sources its results from Bing, so if one is broken, both will be. And that’s a darned shame as I almost hit 12 years of having Duck Duck Go as my default (from August 2010 or thereabouts).

All the more reason to retain Mojeek as my default search engine.

Will I bother looking any more into Bing? Probably not, but how do I convince all those to whom I recommended Duck Duck Go to check out Mojeek?

Posted in internet, marketing, publishing, technology, USA | 4 Comments »