Posts tagged ‘Microsoft’


Forget the 2010s and 2020s, Bing’s results are firmly in the 2000s now

09.10.2022

Immediately after I blogged about Bing being able to pick up an article from 2022, Microsoft’s collapsing search engine reverted to being the Wayback Machine. It spent just over a week living in the 2020s, but it seems that was too much for it.

It’s back to, well, Bing Vista, for want of a better term. Of the 50 results (out of a claimed 120!) that it’s capable of returning for site:lucire.com, here is how it breaks down based on the publication year of the article. Since my last test, Bing has eliminated the 2018 and 2019 results (one page per year). We wouldn’t want to think it could deliver anything from the last decade, would we?
 
Bing
Contents’ pages ★★
1997
1998
1999
2000
2001 ★★★★★
2002
2003 ★★★★
2004 ★★★★
2005 ★★
2006 ★
2007 ★★★★★★★
2008 ★★
2009 ★★
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
 

There were 29 unique results, which means 21 were repeats—42 per cent! Bing says it had 120 results but really only had 29. To fill up the 50 it had to show 21 results multiple times!
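The arithmetic is easy to check. Here is a minimal sketch, assuming the 50 result URLs have been copied by hand into a list; the example.com URLs below are stand-ins of my own, not the real results.

```python
def repeat_stats(urls):
    """Return (shown, unique, repeats, repeat_share) for a list of result URLs."""
    shown = len(urls)
    unique = len(set(urls))
    repeats = shown - unique
    share = repeats / shown if shown else 0.0
    return shown, unique, repeats, share

# Hypothetical stand-in data: 29 distinct URLs, 21 of which show up a second time
sample = [f"https://example.com/page{i}.html" for i in range(29)]
sample += sample[:21]

shown, unique, repeats, share = repeat_stats(sample)
print(f"{shown} shown, {unique} unique, {repeats} repeats ({share:.0%})")
# prints: 50 shown, 29 unique, 21 repeats (42%)
```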

Let’s see how Google fared for the first 50 results.
 
Google
Contents’ pages ★★★★★★★★★★
1997
1998
1999
2000
2001
2002 ★★
2003
2004 ★★
2005 ★
2006
2007 ★
2008
2009 ★
2010 ★★
2011 ★★★
2012 ★★
2013 ★★
2014 ★★★
2015 ★★
2016 ★★
2017 ★★
2018 ★★
2019 ★★★
2020 ★★★
2021 ★★
2022 ★★★★★
 
Google has moved again since we began looking at things. In an earlier test tonight, Google had two repeat results, which was a surprise. But I wasn’t able to replicate it when I did the one for the blog post.

No such issues at Mojeek, where every entry is unique. Its results for site searches really are superior to the other two’s.
 
Mojeek
Contents’ pages ★★★★★★★★
1997
1998
1999
2000
2001
2002
2003
2004 ★
2005
2006
2007
2008
2009 ★
2010 ★★
2011 ★★
2012 ★★★
2013 ★★★★★
2014 ★★★
2015 ★★★★★
2016
2017
2018
2019 ★★★★
2020 ★★
2021 ★★★★★★★★★
2022 ★★★★★★
 
This is an improvement on our September 21 test: Mojeek has managed to capture more 2020s pages in its top 50.
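For anyone wanting to repeat the exercise, the year-by-year charts above can be built with a few lines of code. This is only a sketch: it assumes the publication year of each result has already been worked out by hand, and it re-enters the Bing tallies from the first chart as a plain dictionary.

```python
from collections import Counter

def star_chart(years, first=1997, last=2022):
    """Render one line per year, with one star per result published that year."""
    counts = Counter(years)
    lines = []
    for year in range(first, last + 1):
        stars = "★" * counts[year]
        lines.append(f"{year} {stars}".rstrip())
    return lines

# Bing tallies from the first chart, entered by hand as year -> count
bing = {2001: 5, 2003: 4, 2004: 4, 2005: 2, 2006: 1, 2007: 7, 2008: 2, 2009: 2}
years = [year for year, n in bing.items() for _ in range(n)]

print("\n".join(star_chart(years)))

# The contents' pages are counted separately (two stars in the Bing chart)
print(sum(bing.values()) + 2, "unique results in total")
```

Adding the two contents’ pages to the yearly tallies gives the 29 unique results mentioned earlier.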

I won’t run the other search engines through this—I just wanted two points of comparison to highlight how ridiculous Bing remains, with the resultant effect on web traffic. It means Duck Duck Go, Qwant, Ecosia, Yahoo and others, which are also Bing, are just as compromised.

I might lay off them for a while as we know it’s crap and things aren’t going to change. Microsoft has firmly entrenched itself as a bunch of liars, like their other Big Tech counterparts.

Posted in internet, technology, USA


Bing actually indexes a page from 2022—Microsoft must be in shock

08.10.2022

This is a miracle. Bing actually indexed and showed something from Lucire from 2022. Of course, since it’s incapable of remembering what it had shown in its results earlier, the story was repeated twice on subsequent pages.

Since I began my tests earlier this year (and found out that yet another area that tech companies brag about is actually half-baked and filled with BS), this is the first story from 2022 that Bing has picked up. Who knew, Microsoft’s much-talked-about search engine actually getting to something from after 2007?
 


Posted in internet, technology, USA


Windows 11 22H2 arrives; now for the usual post-upgrade tweaks

02.10.2022

Windows 11 22H2 arrived for me yesterday, and the first order of business, as always, was to sort out the typography. This earlier post is roughly right: make the registry hacks, then change the properties of the fonts in C:\Windows\WinSxS (namely by giving them administrator access) before deleting them. However, I needed one extra step to get them out of C:\Windows\Fonts, and that was to boot up in safe mode and delete them from 7-Zip. Only then could I change the properties and say farewell to the dreaded Arial.

You still can’t type most characters beyond the ASCII range (code 128 and up) in Notepad, a crazy state of affairs introduced during Windows 11’s time, though I managed to get the pound sterling sign to work (even though there might be less need to type it now thanks to the UK government). I guess no one uses the euro symbol at Redmond, or goes to a café (forget about any accented characters).

We’ll see if Explorer still rotates photos by itself, but as I’ve replaced it with One Commander for most of my file management, it will be a while before I find out.

The new icons look good, and the new Maps seems to work reasonably well. Mostly I just care that my usual programs are fine and Windows’ font substitutes don’t do anything silly.
 

Posted in design, technology, typography


Bing hates novelty—it’s really Microsoft’s Wayback Machine

27.08.2022

Bing is still very clearly near death, as this latest site: search shows.
 

 

It manages a grand total of 10 pages from Lucire, and as outlined before, some are pages that have not been linked to for 17 years.

I purposely updated some of the pages Bing had in its limited capacity, and strangely, those have disappeared! Bing doesn’t want anything new, as it appears to be Microsoft’s Wayback Machine.

The fifth result here is a case in point. Some of you may recall lucire.com/about.shtml appearing in all the search engines, including Bing. This is a page last updated in 2004, with some final tweaks in 2012 (I assume for ad code; I don’t recall). It was a page that I decided I would stick on to a new template, since the search engines loved it so much. I copied the text from our licensing site. And, for the sake of online archæology, I put the 2004 page exactly as it was into a file called about-2004.shtml.

Bing must still be alive enough to spider and index the renamed page, but it rejects the revised about.shtml!

It’s similar to what I wrote in mid-August when I updated other ancient pages from the early 2000s: Bing rejected them, including a frameset that now pointed at the latest page!

You may be thinking: obviously, you are doing something wrong with your newer code, Jack, for Bing to favour the old stuff. But look at the fourth result: it’s from 2020, the one “new” page that Bing has managed to index and show. I don’t think we have anything wrong with our code if this page has made it in.

Google happily included the new about.shtml.

A search for Lucire itself on Bing now does include the home page, which is a new development in a search engine that’s limping along. So much for the earlier claim that there were issues with the page that prevented it from appearing.

Posted in internet, media, publishing, technology, USA


More of Bing’s follies (they just keep coming)

16.08.2022

I see WorldWideWebSize.com has wised up and figured out Bing was having them on about the number of results it had for their search terms.
 

 

When I said that Bing claims 300-odd results for site:lucire.com yet doesn’t actually go beyond a limit of around 50 (where it has been stuck for many months), I was actually being generous. I never deducted the repeated results on the pages that it did show.

Here’s a case in point: an ego search for my own name. These are the first four pages. I realize I have the graphics a bit small, but you should be able to make out just how many pages have been repeated here. Regular search engines like Mojeek and Google show you different results on each page. Bing doesn’t.
 




 

More strange happenings: you’ll recall I noted that pages we haven’t linked to since the 2000s were up top in a site search on Bing for lucire.com. The very top one was lp.html, a frameset (yes, it’s that old). I did what I thought would be logical in such a circumstance: I pointed one of the frames to the current 2022 page (which is still regular HTML, but with Bootstrap).

Result in Bing: it’s vanished.

Did the same to news.html, not linked to since 2012.

Vanished.
 

 

The current news page is WordPress, but Bing still manages to index the occasional WordPress page on our site. The fact it’s PHP shouldn’t make a difference.

These pages are just too new for Bing, which is really Microsoft’s own Wayback Machine. And Duck Duck Go’s, and Qwant’s, and all manner of search engines’.
 
Meanwhile at Brave: it does have an independent spider but admits to using the Bing API for the image search, as does Mojeek. But what Brave doesn’t say is that it also taps into Bing for site: searches, rendering them largely useless, too. Brave does a far better job than Bing in its regular search though, picking up lucire.com for Lucire as well as some major index pages.
 

On a regular search, Brave does rather well—it’s picked up the top pages.
 


Bing and Brave compared, using site:lucire.com. Brave isn’t as independent as you might think with site: and image searches. These screenshots were taken on Sunday.
 

Still well short of Mojeek in terms of its index—but then so is everyone aside from Google.

The saga continues, with still no one talking about Bing’s collapse (though I know of one journalist working away behind the scenes).

Posted in branding, business, internet, technology, USA


IndexNow is a crock

13.08.2022


 
Just trying to clear a few things off my hard drive. Here was one that was particularly curious when I was investigating what was going on with Bing: the files submitted by Cloudflare’s IndexNow. The theory: it would send Bing the newest accessed pages to add to the index. The reality: these are not new. In fact, these are ancient, and many aren’t even web pages (they’re PDFs and web fonts). And sure enough, some did make it into the 10–55 pages that Bing is capable of indexing for Lucire these days. It’s a very tiny index in reality, regardless of how many results it claims to have for a given search, as we discovered.

In other words, IndexNow, as I saw it implemented, is a total crock, and not worth the bother.

I wish these companies would test these things first, but we are talking Microsoft, where we’ve been doing the job as unpaid QA for decades.
 
It does get worse. Looking inside Bing Webmaster Tools, these (below) are the pages it says it has for Lucire’s root directory. I’ve alluded to how bad it was earlier, but upon going through these, the main index pages, which Bing always had till recently, are missing. The home page is also missing (although when I first started investigating in July, it was still there, which a friend can confirm; and the structure of it has not changed other than the removal of some links to 404s). All that’s left are pages from the early 2000s, plus entries for pages that have never existed. You can check these against the Wayback Machine, but we have never had pages in the main directory called nguoi-noi-tieng, arts-culture, podcast, form-single.html, archivi or cv-generator. Yet Bing believes these phantom pages exist. Well done, Microsoft, you can’t even get this right. This isn’t how spidering works.
 

Posted in internet, technology


Mystery sitemap files in Bing

12.08.2022

I only signed up to Bing Webmaster Tools when investigating why the company site did so poorly in Bing and Duck Duck Go—we now know it was nothing to do with us, and everything to do with a search engine basically disintegrating before our very eyes.

This, too, was interesting, from a screenshot dated July 20, 2022. I never added these sitemaps, and they all pre-date when I signed up to Webmaster Tools. They were all there when I went to the tools for lucire.com. They are not RSS feeds we’ve ever sanctioned, though of course someone could have created them intentionally to follow a subject. Maybe someone at Microsoft?
 

 

You may notice the number of pages: 51. These 51, however, have no real bearing on the 50-odd that Bing can display before it craps out.

I’ve since added sitemaps for the rest of the site, to no avail, natch.

Anyone else find weird sitemap files in their account after signing up?

Posted in internet, technology


What search engines show in their top 10 isn’t always relevant

09.08.2022

The Bing collapse did lead me to look at some of the ancient pages on the Lucire site that the search engines were still very fond of. For instance, the ‘About’ page was still appearing up top, which is bizarre since we haven’t made any links to it for years—it reflected our history in 2004.

Naturally, once I updated it, it promptly disappeared from Bing! Too new for Microsoft’s own Wayback Machine!

I was always told that you shouldn’t delete old pages, and that 301s were the best solution. I’m enough of a computing neophyte to not know how to implement 301s (.htaccess doesn’t work, at least not on our set-up) and page refreshes are often frowned upon, which is why so many old pages are still there.
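For readers as unsure about the mechanics as I am: a 301 is nothing more than a response with status code 301 and a Location header naming the new address. The sketch below shows the idea using Python’s built-in http.server. It is purely illustrative and is not how our site is served; the paths in the mapping are hypothetical.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping of retired paths to their replacements (not real Lucire URLs)
REDIRECTS = {
    "/2005/old-index.html": "/index.html",
    "/2000/old-reviews.html": "/reviews/",
}

def redirect_target(path, redirects=REDIRECTS):
    """Return the new location for a retired path, or None if it has not moved."""
    return redirects.get(path)

class RedirectHandler(BaseHTTPRequestHandler):
    """Answer requests for retired pages with a permanent redirect."""

    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)
            return
        # 301 tells browsers and spiders that the move is permanent
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()
```

To try it locally, the handler could be passed to http.server.HTTPServer; on a real host, the equivalent rule would live in whatever configuration the web server actually honours.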

However, you would naturally expect that a web spider following links would not rank anything that hasn’t been linked to for over a decade very highly. If the spider comes in, picks up the latest stuff from your home page, possibly the latest stuff from individual topic pages, it would figure out what all of these were linking to, and conclude that something from 2000 that was buried deep within the site was no longer current, or of only passing interest to surfers.

I realize I’ve had a go at search engines for burying relevant things in favour of novel things, but we’re talking pages here that aren’t even relevant. ‘About’ I’ll let them have, but a 2000 book reviews’ page? A subject index page from 2005 that hasn’t been linked to since 2005, and the pages that do are well outnumbered by newer ones? Because, the deletion of ‘About’ aside, here is what Bing thinks is the most important for site:lucire.com:
 

 

Google fares a little better. Our home page and current print edition ordering page are top, shopping is third, followed by the fashion contents’ page (makes sense). ‘About’ comes in fifth, for whatever reason, then a 2005 competition page that we should probably delete (it refreshes to another page from 2005—so much for refresh pages being bad for search engines).

Seventh is yet another ancient page from 2005, namely a frameset—which I’ve since updated so at least the main frame loads something current. The remainder are articles from 2011, 2022 and 2016. The next page comprises articles and tags, which seem to make sense.

Mojeek actually makes more sense than Google. Home page in first, the news page (the next most-updated) is second, followed by the travel contents’ page. Then there are two older print edition pages (2020 and 2012), followed by a bunch of articles (2013, 2014, 2013, 2013), and the directory page for Lucire TV. There’s nothing here that I find strange: everything is logically found by a spider going through the site, and maybe those four articles from the 2010s are relevant to the word Lucire (given that you can’t do site: searches on Mojeek without a keyword, so it repeats the word before the TLD)? The reference to the 2012 issue might be down to my having mentioned it recently during our 25th anniversary posts. But there are no refresh pages and no framesets.

Startpage, not Google, has a couple of frameset pages from 2000 and 2002 in their top 10 which again weren’t linked to, at least not purposefully (they were placed there to catch people trying to look at the directory index in the old days). There’s incredibly little “link juice” to these pages. However, ‘About’ (in 10th), and these two framesets aside, its Google-sourced results fare remarkably well. In order: home page, print edition ordering page, the two framesets, the news section, the shopping page (barely updated but I can see why it’s there), the community page, Lucire TV, the fashion contents, ‘About’.

Duck Duck Go is so compromised by Bing that it barely merits a mention here. Four pages from 2000 and 2005 that no current page links to, a 404 page that we’ve never even had on our site (!), articles from 2021, 2018, 2007 and 2000 (in that order), and a PDF (!) from 2004. Fancy having a 404 that never even existed in the top 10!

If I had my way, it’d be home page, followed by the different sections’ contents’ pages, then the most popular article—though if a couple of articles go (or went) viral, then I’d expect them sooner.

Both Mojeek and Google do well here, with four of these pages each in their top 10s. But it’s Startpage’s unfiltered Google results that do best, hitting linked, relevant pages in seven results out of the top 10. Bing and its licensees miss the mark completely. If you must have a Google bias, then Startpage is the way to go; for our purposes, Mojeek remains the better option.
 
★★★★★★★☆☆☆ Startpage
★★★★☆☆☆☆☆☆ Mojeek
★★★★☆☆☆☆☆☆ Google
★★☆☆☆☆☆☆☆☆ Virtual Mirage
★☆☆☆☆☆☆☆☆☆ Baidu
★☆☆☆☆☆☆☆☆☆ Yandex
☆☆☆☆☆☆☆☆☆☆ Bing
☆☆☆☆☆☆☆☆☆☆ Qwant
☆☆☆☆☆☆☆☆☆☆ Swisscows
☆☆☆☆☆☆☆☆☆☆ Brave
☆☆☆☆☆☆☆☆☆☆ Duck Duck Go (would give –1 for the 404 if I could)

Posted in France, internet, New Zealand, publishing, technology, UK, USA


Attempting re-entry into Bing’s Pubhub

08.08.2022

In early July, I wanted to see if we could add Lucire to Bing as a news source in their Pubhub—after all, Google has us as one, as Yahoo, Altavista and Excite had back in the day. And I’d say that 25 years of publishing with an international team might qualify us as being media.

The folks came back rejecting us, saying we needed to come back in a month’s time. Usual story: look at our rules, you must have messed up.

Bing tells everyone this these days, because it’s a good way to keep webmasters confounded as they try to figure out what’s wrong with their site and why they can’t get it listed. It’s the same with Pubhub.

The one “rule” that might be very broadly interpreted in their favour was that articles needed to have bylines. Granted, a lot of news ones don’t, since sometimes we don’t want credit for them, and you don’t always see a reporter’s name for shorter, simpler items. But features do have bylines. And when Bing swung round in early July, coincidentally I had written quite a lot of the last bunch of articles, so my name was all over them. That was a no-no.

So here we are, a month and a few days on. The home page (the one that Bing declines to include in their index now, as it prefers pages from the early 2000s that we haven’t linked to for over 17 years) contains articles from me, Stanley Moss, Lola Cristall, Jody Miller, and Elyse Glickman. There’s one story on Panos Papadopoulos that he wrote in the first person.

What’s the bet that nothing will happen?

Sometimes you have to give it a go, even when you know nothing will happen—just to prove a point.
 

Above: The top pages in a site:lucire.com search on Bing. Five of these pages we haven’t linked to in 17 years. As a search engine, it makes absolutely no sense.
 
I was surprised, however, that Bing claims to have 330 results for site:lucire.com today, up from 10. It’s still a tenth of what Mojeek has, and a twentieth of what Google has. But it is an improvement. Maybe the worst is over?

It’s still useless as a general search though, and even more useless as an internal search. The fact that popular pages are excluded and 17-year-old ones aren’t means something remains very wrong with the search engine.
 
PS. (August 9 NZST): I spoke too soon. Bing says 330 results, but try looking beyond 50, which was what it tended to cap Lucire at.
 

Posted in internet, media, publishing, technology


Mojeek shows more in its search results than Google

02.08.2022

This was something I had forgotten when doing the numbers on how many pages each search engine had indexed from our sites: what they claim to be their index size and what they let you access are two different things.

And in Lucire’s case, Google, curiously, mostly does not allow access to our dynamic pages in PHP in its main index, reserving them for Google News. Google News, however, has both PHP and HTML. It’s only when you feed in a specific request for one of our stories that we know is on a PHP-generated page that it comes up in the main index’s results.

Let me explain. Remember this from a blog post in July? These are what the search engines said they had indexed for lucire.com (in a site:lucire.com search). I’ve updated it for August 2 and added one more search engine, Yep, another independent, out of interest.
 
Google: 10,600
Mojeek: 3,593
Duck Duck Go: 50
Brave: 19
Bing: 10
Yep: 10
 

But can you see 10,600? Here’s the reality of what is truly visible at the moment when you browse the results’ pages of each search engine as of today:
 
Google: 304
Mojeek: 1,000
Duck Duck Go: 50
Brave: 19
Bing: 10
Yep: 10
 


Above: Google (top) shows fewer pages than Mojeek in a site: search.
 

Mojeek maxes out at 1,000 by design, but like Google, it will find a specific article outside of the 1,000 shown if searched for. Google conks out at 304 (303 when I first did this test).

The bigger Google index is its advantage, but Mojeek does a fine job by sharing more in its results’ pages than Google does—over three times as many. Another win for the plucky independent out of the UK.
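To put numbers on the gap between what is claimed and what can actually be browsed, here is a quick sketch using the two sets of figures above, typed in by hand.

```python
# Figures from the two lists above: what each engine claims vs what it lets you browse
claimed = {"Google": 10_600, "Mojeek": 3_593, "Duck Duck Go": 50,
           "Brave": 19, "Bing": 10, "Yep": 10}
visible = {"Google": 304, "Mojeek": 1_000, "Duck Duck Go": 50,
           "Brave": 19, "Bing": 10, "Yep": 10}

for engine, total in claimed.items():
    share = visible[engine] / total
    print(f"{engine}: {visible[engine]:,} of {total:,} claimed ({share:.1%})")

# How many times more results Mojeek exposes than Google in a site: search
ratio = visible["Mojeek"] / visible["Google"]
print(f"Mojeek shows {ratio:.2f} times as many results as Google")
```

The last line works out to roughly 3.29, which is where “over three times as many” comes from.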
 
While we’re on the subject, notice how small the Bing index is getting, returning just 10 pages for lucire.com? It’s really collapsed in a big way. Feeding in the other sites I tested earlier, Bing shows declines all round, apart from Travel & Leisure.

Fancy having only 2,723 results from The New York Times, down from 1,190,000 on the 24th ult. Mojeek has over 1,000 times more than Bing, and Google over 12 times more than that.

Previous numbers in parentheses below.
 
Die Zeit
Google: 2,710,000 (2,600,000)
Mojeek: 4,891 (4,796)
Bing: 3,268 (3,770)
 
Annabelle (Switzerland)
Google: 11,900 (11,700)
Mojeek: 408 (405)
Bing: 26 (105)
 
Holly Jahangiri
Google: 618 (738)
Mojeek: 236 (222)
Bing: 10 (49)
 
The Gloss (Ireland)
Google: 17,600 (19,200)
Mojeek: 2,009 (1,968)
Bing: 20 (71)
 
The New York Times
Google: 36,500,000 (36,200,000)
Mojeek: 2,879,513 (2,823,329)
Bing: 2,723 (1,190,000)
 
Lucire
Google: 10,600 (6,050)
Mojeek: 3,593 (3,572)
Bing: 10 (50)
 
The Rake
Google: 11,100 (11,500)
Mojeek: 1,445 (1,443)
Bing: 16, but claims 4! (49)
 

 
Travel & Leisure
Google: 33,500 (28,100)
Mojeek: 10,081 (9,750)
Bing: 383 (220)
 
Microsoft
Google: 118,000,000 (122,000,000)
Bing: 1,927,118 (14,200,000)
Mojeek: 1,772,165 (1,748,199)
 
Detective Marketing
Google: 961 (998)
Mojeek: 579 (579)
Bing: 16 (51)
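The changes in Bing’s numbers, and the New York Times comparison made earlier, can be checked with a short script. This is a sketch using the figures listed above; for The Rake I have used the 16 results Bing displayed rather than the 4 it claimed.

```python
# (current, previous) Bing counts as listed above
bing = {
    "Die Zeit": (3_268, 3_770),
    "Annabelle": (26, 105),
    "Holly Jahangiri": (10, 49),
    "The Gloss": (20, 71),
    "The New York Times": (2_723, 1_190_000),
    "Lucire": (10, 50),
    "The Rake": (16, 49),  # displayed count, not the claimed 4
    "Travel & Leisure": (383, 220),
    "Microsoft": (1_927_118, 14_200_000),
    "Detective Marketing": (16, 51),
}

def pct_change(current, previous):
    """Percentage change from the previous count to the current one."""
    return (current - previous) / previous * 100

for site, (now, before) in bing.items():
    print(f"{site}: {pct_change(now, before):+.1f}%")

# The New York Times figures for the three engines
nyt_google, nyt_mojeek, nyt_bing = 36_500_000, 2_879_513, 2_723
print(f"Mojeek vs Bing: {nyt_mojeek / nyt_bing:,.0f} times")
print(f"Google vs Mojeek: {nyt_google / nyt_mojeek:.1f} times")
```

Only Travel & Leisure comes out with a positive change; every other site has fallen.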

Posted in internet, technology, UK, USA