The Persuader
My personal blog, started in 2006. No paid or guest posts, no link sales.
Archive for the ‘technology’ category
16.08.2022
I see WorldWideWebSize.com has wised up and figured out Bing was having them on about the number of results it had for their search terms.

Bing says it has 300-odd results for site:lucire.com, yet doesn’t actually go beyond a limit of around 50 (where it has been stuck for many months). In reporting that, I was actually being generous: I never deducted the repeated results on the pages that it did show.
Here’s a case in point: an ego search for my own name. These are the first four pages. I realize I have the graphics a bit small, but you should be able to make out just how many pages have been repeated here. Regular search engines like Mojeek and Google show you different results on each page. Bing doesn’t.
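To put a number on the repetition, the results can be tallied across pages. This is a minimal sketch, assuming the URLs from each results page have already been copied into lists by hand; the `pages` data and both function names are hypothetical, not taken from any search engine's API.

```python
from collections import Counter

def repeated_results(pages):
    """Return the URLs that appear more than once across all results pages."""
    counts = Counter(url for page in pages for url in page)
    return {url: n for url, n in counts.items() if n > 1}

def unique_count(pages):
    """Count the distinct URLs across all results pages."""
    return len({url for page in pages for url in page})

# Hypothetical data: three results pages of three links each.
pages = [
    ["https://example.com/1", "https://example.com/2", "https://example.com/3"],
    ["https://example.com/2", "https://example.com/4", "https://example.com/1"],
    ["https://example.com/1", "https://example.com/5", "https://example.com/2"],
]

shown = sum(len(page) for page in pages)
print(f"{shown} results shown, {unique_count(pages)} unique")
print(repeated_results(pages))
```

Deducting the repeats this way gives the real number of distinct results, which is the figure that matters when comparing engines.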




More strange happenings: you’ll recall I noted that pages we haven’t linked to since the 2000s were up top in a site search on Bing for lucire.com. The very top one was lp.html, a frameset (yes, it’s that old). I did what I thought would be logical in such a circumstance: I pointed one of the frames to the current 2022 page (which is still regular HTML, but with Bootstrap).
Result in Bing: it’s vanished.
Did the same to news.html, not linked to since 2012.
Vanished.

The current news page is Wordpress, but Bing still manages to index the occasional Wordpress page on our site. The fact it’s PHP shouldn’t make a difference.
These pages are just too new for Bing, which is really Microsoft’s own Wayback Machine. And Duck Duck Go’s, and Qwant’s, and a whole manner of search engines’.
Meanwhile at Brave: it does have an independent spider but admits to using the Bing API for the image search, as does Mojeek. But what Brave doesn’t say is that it also taps in to Bing for site: searches, rendering them largely useless, too. Brave does a far better job than Bing in its regular search though, picking up lucire.com for Lucire as well as some major index pages.

On a regular search, Brave does rather well: it’s picked up the top pages.


Bing and Brave compared, using site:lucire.com. Brave isn’t as independent as you might think with site: and image searches. These screenshots were taken on Sunday.
Still well short of Mojeek in terms of its index, but then so is everyone aside from Google.
The saga continues, with still no one talking about Bing’s collapse (though I know of one journalist working away behind the scenes).
Tags: 2000s, 2005, 2012, 2022, Bing, Brave, history, Microsoft, search engines, technology, USA Posted in branding, business, internet, technology, USA | No Comments »
13.08.2022

Just trying to clear a few things off my hard drive. Here was one that was particularly curious when I was investigating what was going on with Bing: the files submitted by Cloudflare’s IndexNow. The theory: it would send Bing the newest accessed pages to add to the index. The reality: these are not new. In fact, these are ancient, many aren’t even web pages (they’re PDFs and web fonts). And sure enough, some did make it into the 10–55 pages that Bing is capable of indexing for Lucire these days. It’s a very tiny index in reality, regardless of how many results it claims to have for a given search, as we discovered.
In other words, IndexNow, as I saw it implemented, is a total crock, and not worth the bother.
I wish these companies would test these things first, but we are talking Microsoft, where we’ve been doing the job as unpaid QA for decades.
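For contrast with what was actually sent, here is a minimal sketch of a submission restricted to recently changed web pages. The JSON field names (`host`, `key`, `keyLocation`, `urlList`) follow the published IndexNow protocol; everything else is an assumption for illustration, including the helper names, the seven-day cutoff, the list of page extensions, the key and the example URLs. It is not Cloudflare’s implementation, and it only builds the payload without sending it.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse

# Assumption: only these extensions (or none, for directory URLs) count as web pages.
PAGE_EXTENSIONS = ("", ".html", ".htm", ".php")

def is_page(url):
    """True if the URL looks like a web page rather than a PDF, font or other asset."""
    name = urlparse(url).path.rsplit("/", 1)[-1]
    ext = "." + name.rsplit(".", 1)[-1].lower() if "." in name else ""
    return ext in PAGE_EXTENSIONS

def build_payload(host, key, changes, now, max_age_days=7):
    """Build an IndexNow-style JSON body from (url, last_modified) pairs."""
    cutoff = now - timedelta(days=max_age_days)
    urls = [url for url, modified in changes if modified >= cutoff and is_page(url)]
    return {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": urls,
    }

# Hypothetical change log: one fresh page, two fresh assets, one ancient page.
now = datetime(2022, 8, 13, tzinfo=timezone.utc)
changes = [
    ("https://example.com/news/", datetime(2022, 8, 12, tzinfo=timezone.utc)),
    ("https://example.com/media/kit.pdf", datetime(2022, 8, 12, tzinfo=timezone.utc)),
    ("https://example.com/fonts/body.woff", datetime(2022, 8, 12, tzinfo=timezone.utc)),
    ("https://example.com/2000/review.html", datetime(2000, 1, 1, tzinfo=timezone.utc)),
]

payload = build_payload("example.com", "hypothetical-key", changes, now)
print(json.dumps(payload, indent=2))
```

With a filter like this, only the fresh news page would be submitted; the PDF, the web font and the page from 2000 would be left out.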
It does get worse. Looking inside Bing Webmaster Tools, these (below) are the pages it says it has for Lucire’s root directory. I’ve alluded to how bad it was earlier, but upon going through these, the main index pages, which Bing always had till recently, are missing. The home page is also missing (although when I first started investigating in July, it was still there, which a friend can confirm; and the structure of it has not changed other than the removal of some links to 404s). All that’s left are pages from the early 2000s, plus entries for pages that have never existed. You can check these against the Wayback Machine, but we have never had pages in the main directory called nguoi-noi-tieng, arts-culture, podcast, form-single.html, archivi or cv-generator. Yet Bing believes these phantom pages exist. Well done, Microsoft, you can’t even get this right. This isn’t how spidering works.

Tags: 2022, bug, Cloudflare, internet, Microsoft, technology Posted in internet, technology | No Comments »
12.08.2022
I only signed up to Bing Webmaster Tools when investigating why the company site did so poorly in Bing and Duck Duck Go. We now know it was nothing to do with us, and everything to do with a search engine basically disintegrating before our very eyes.
This, too, was interesting, from a screenshot dated July 20, 2022. I never added these sitemaps, and they all pre-date when I signed up to Webmaster Tools. They were all there when I went to the tools for lucire.com. They are not RSS feeds we’ve ever sanctioned, though of course someone could have created them intentionally to follow a subject. Maybe someone at Microsoft?

You may notice the number of pages: 51. These 51, however, have no real bearing on the 50-odd that Bing can display before it craps out.
I’ve since added sitemaps for the rest of the site, to no avail, natch.
Anyone else find weird sitemap files in their account after signing up?
Tags: 2022, Bing, Microsoft, search engine, technology Posted in internet, technology | No Comments »
12.08.2022
With all the odd results coming up in site searches (it’s not restricted to Bing), I attended to some of the older pages on our websites.
Curiously, in a site:lucire.com search, even Google has our 2005 competition page up high, namely in fifth. There is only one link from our site internally to this page. I know of none externally. The idea about Backrub and ‘link juice’ doesn’t ring true here as there is no way that page should be ranked so highly.


Top: Google has our 2005 competition page ranked very highly despite it being a redirect. Above: Internally, only one file refers to it, dating from the 2000s.
Not only that, it’s a page that refreshes to another on the site. So much for the idea that these are ranked lowly and that search engines don’t like them.
Nevertheless, as it’s not relevant or useful any more, I deleted it (though it remains in Google at the time of writing).
The ‘About’ page I’ve discussed before and it remains in fourth, despite not being linked from anywhere recent on our site. It was updated with text from our licensing website and now also follows the rest of the site, though we haven’t bothered making any new links to it. It’s really just for the search engines. (For nostalgia’s sake, it has a link to the 2004 page that the search engines love so much.)
We had so many frameset pages on the Lucire site that I updated a few of those, though, rightly or wrongly, I left the frames intact. Well, if they rank so highly, contrary to what the experts all say, then why not?
The one that had the most surgery, however, was jyanet.com/lucire, Lucire’s original URL in 1997. That still comes up in 23rd for me in Google (for the search Lucire), and 20th in Startpage. This hasn’t been linked to since 1998 by us, and I doubt very many outside of our company would. It was our home only for about six months after launch.
Given its enduring popularity, we’ve given it a Bootstrap template and it shares a stylesheet with the rest of the Lucire site, despite it being at another domain. It now contains links to other Lucire sites, which seems a fitting ‘gift’ to the page as we celebrate our 25th anniversary.

Tags: 1997, 2000s, 2022, Google, JY&A Media, Lucire, publishing, search engine, search engines, Startpage, technology Posted in business, internet, New Zealand, publishing, technology, USA | No Comments »
09.08.2022
The Bing collapse did lead me to look at some of the ancient pages on the Lucire site that the search engines were still very fond of. For instance, the ‘About’ page was still appearing up top, which is bizarre since we haven’t made any links to it for years: it reflected our history in 2004.
Naturally, once I updated it, it promptly disappeared from Bing! Too new for Microsoft’s own Wayback Machine!
I was always told that you shouldn’t delete old pages, and that 301s were the best solution. I’m enough of a computing neophyte to not know how to implement 301s (.htaccess doesn’t work, at least not on our set-up) and page refreshes are often frowned upon, which is why so many old pages are still there.
However, you would naturally expect that a web spider following links would not rank anything that hasn’t been linked to for over a decade very highly. If the spider comes in, picks up the latest stuff from your home page, possibly the latest stuff from individual topic pages, it would figure out what all of these were linking to, and conclude that something from 2000 that was buried deep within the site was no longer current, or of only passing interest to surfers.
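That expectation can be checked against a site’s own pages by counting internal inbound links. This is a minimal sketch, assuming the pages are available as a dictionary of URL to HTML; the sample site, the class and the function name are hypothetical, and it says nothing about how any real search engine weighs links.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect link targets from <a href> and <frame src> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        wanted = {"a": "href", "frame": "src"}.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.targets.append(value)

def inbound_counts(site):
    """Map each page URL to the number of other pages on the site linking to it."""
    counts = Counter({url: 0 for url in site})
    for page_url, html in site.items():
        parser = LinkCollector()
        parser.feed(html)
        # Resolve relative links and drop fragments; count each linking page once.
        resolved = {urljoin(page_url, t).split("#")[0] for t in parser.targets}
        for target in resolved:
            if target in site and target != page_url:
                counts[target] += 1
    return counts

# Hypothetical five-page site.
site = {
    "https://example.com/": '<a href="/news.html">News</a> <a href="fashion.html">Fashion</a>',
    "https://example.com/news.html": '<a href="/">Home</a> <a href="/fashion.html">Fashion</a>',
    "https://example.com/fashion.html": '<a href="/">Home</a>',
    "https://example.com/2005/competition.html": '<a href="/">Home</a>',
    "https://example.com/2004/index.html": '<a href="/2005/competition.html">Win</a>',
}

counts = inbound_counts(site)
for url, n in sorted(counts.items(), key=lambda item: item[1]):
    print(n, url)
```

Pages at the top of that printout, with zero or one inbound link, are the ones you would expect a link-following spider to treat as least important.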
I realize I’ve had a go at search engines for burying relevant things in favour of novel things, but we’re talking pages here that aren’t even relevant. ‘About’ I’ll let them have, but a 2000 book reviews’ page? A subject index page from 2005 that we haven’t linked to since 2005, with the pages that do link to it well outnumbered by newer ones? Because, the deletion of ‘About’ aside, here is what Bing thinks is the most important for site:lucire.com:

Google fares a little better. Our home page and current print edition ordering page are top, shopping is third, followed by the fashion contents’ page (makes sense). ‘About’ comes in fifth, for whatever reason, then a 2005 competition page that we should probably delete (it refreshes to another page from 2005; so much for refresh pages being bad for search engines).
Seventh is yet another ancient page from 2005, namely a frameset, which I’ve since updated so at least the main frame loads something current. The remainder are articles from 2011, 2022 and 2016. The next page comprises articles and tags, which seem to make sense.
Mojeek actually makes more sense than Google. Home page in first, the news page (the next most-updated) is second, followed by the travel contents’ page. Then there are two older print edition pages (2020 and 2012), followed by a bunch of articles (2013, 2014, 2013, 2013), and the directory page for Lucire TV. There’s nothing here that I find strange: everything is logically found by a spider going through the site, and maybe those four articles from the 2010s are relevant to the word Lucire (given that you can’t do site: searches on Mojeek without a keyword, so it repeats the word before the TLD)? The reference to the 2012 issue might be down to my having mentioned it recently during our 25th anniversary posts. But there are no refresh pages and no framesets.
Startpage, not Google, has a couple of frameset pages from 2000 and 2002 in its top 10 which again weren’t linked to, at least not purposefully (they were placed there to catch people trying to look at the directory index in the old days). There’s incredibly little ‘link juice’ to these pages. However, ‘About’ (in 10th), and these two framesets aside, its Google-sourced results fare remarkably well. In order: home page, print edition ordering page, the two framesets, the news section, the shopping page (barely updated but I can see why it’s there), the community page, Lucire TV, the fashion contents, ‘About’.
Duck Duck Go is so compromised by Bing that it barely merits a mention here. Four pages from 2000 and 2005 that no current page links to, a 404 page that we’ve never even had on our site (!), articles from 2021, 2018, 2007 and 2000 (in that order), and a PDF (!) from 2004. Fancy having a 404 that never even existed in the top 10!
If I had my way, it’d be home page, followed by the different sections’ contents’ pages, then the most popular article, though if a couple of articles go (or went) viral, then I’d expect them sooner.
Both Mojeek and Google do well here, with four of these pages each in their top 10s. But it’s Startpage’s unfiltered Google results that do best, hitting linked, relevant pages in seven results out of the top 10. Bing and its licensees miss the mark completely. If you must have a Google bias, then Startpage is the way to go; for our purposes, Mojeek remains the better option.
★★★★★★★☆☆☆ Startpage
★★★★☆☆☆☆☆☆ Mojeek
★★★★☆☆☆☆☆☆ Google
★★☆☆☆☆☆☆☆☆ Virtual Mirage
★☆☆☆☆☆☆☆☆☆ Baidu
★☆☆☆☆☆☆☆☆☆ Yandex
☆☆☆☆☆☆☆☆☆☆ Bing
☆☆☆☆☆☆☆☆☆☆ Qwant
☆☆☆☆☆☆☆☆☆☆ Swisscows
☆☆☆☆☆☆☆☆☆☆ Brave
☆☆☆☆☆☆☆☆☆☆ Duck Duck Go (would give −1 for the 404 if I could)
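The scoring here is simply the number of linked, relevant pages each engine puts in its top 10, shown as stars out of ten. A minimal sketch of that tally follows, using only the scores stated in the text (Startpage seven, Mojeek and Google four each, Bing none); the function name is hypothetical.

```python
def stars(hits, out_of=10):
    """Render a score as filled and empty stars, e.g. 7 of 10."""
    if not 0 <= hits <= out_of:
        raise ValueError("hits must be between 0 and out_of")
    return "★" * hits + "☆" * (out_of - hits)

# Scores as stated in the post: relevant, linked pages in each top 10.
scores = {"Startpage": 7, "Mojeek": 4, "Google": 4, "Bing": 0}

for engine, hits in sorted(scores.items(), key=lambda item: -item[1]):
    print(stars(hits), engine)
```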
Tags: 2000s, 2010s, 2020s, 2022, Bing, Duck Duck Go, Google, JY&A Media, Lucire, Microsoft, Mojeek, publishing, search engines Posted in France, internet, New Zealand, publishing, technology, UK, USA | No Comments »
08.08.2022
In early July, I wanted to see if we could add Lucire to Bing as a news source in their Pubhub. After all, Google has us as one, as Yahoo, Altavista and Excite had back in the day. And I’d say that 25 years of publishing with an international team might qualify us as being media.
The folks came back rejecting us, saying we needed to come back in a month’s time. Usual story: look at our rules, you must have messed up.
Bing tells everyone this these days, because it’s a good way to keep webmasters confounded as they try to figure out what’s wrong with their site and why they can’t get it listed. It’s the same with Pubhub.
The one ‘rule’ that might be very broadly interpreted in their favour was that articles needed to have bylines. Granted, a lot of news ones don’t, since sometimes we don’t want credit for them, and you don’t always see a reporter’s name for shorter, simpler items. But features do have bylines. And when Bing swung round in early July, coincidentally I had written quite a lot of the last bunch of articles, so my name was all over them. That was a no-no.
So here we are, a month and a few days on. The home page (the one that Bing declines to include in their index now, as it prefers pages from the early 2000s that we haven’t linked to for over 17 years) contains articles from me, Stanley Moss, Lola Cristall, Jody Miller, and Elyse Glickman. There’s one story on Panos Papadopoulos that he wrote in the first person.
What’s the bet that nothing will happen?
Sometimes you have to give it a go, even when you know nothing will happen, just to prove a point.

Above: The top pages in a site:lucire.com search on Bing. Five of these pages we haven’t linked to in 17 years. As a search engine, it makes absolutely no sense.
I was surprised, however, that Bing claims to have 330 results for site:lucire.com today, up from 10. It’s still a tenth of what Mojeek has, and a twentieth of what Google has. But it is an improvement. Maybe the worst is over?
It’s still useless as a general search though, and even more useless as an internal search. The fact that popular pages are excluded and 17-year-old ones aren’t means something remains very wrong with the search engine.
PS. (August 9 NZST): I spoke too soon. Bing says 330 results, but try looking beyond 50, which was what it tended to cap Lucire at.

Tags: 2022, Bing, JY&A Media, Lucire, media, Microsoft, news, publishing, search engine, technology Posted in internet, media, publishing, technology | No Comments »
07.08.2022
Ever since we had to reset the counter for Autocade in March, because of a new server and a new version of Mediawiki, it’s been interesting to see which pages are most popular.
The old ranking took into account everything from March 2008 to March 2022. With everything set to zero again, I can now see what’s been most popular in the last few months.
Some of the top 20 were among the top pages before March 2022, but what’s surprising is what’s shot up into the top slots.
Over the course of half a day on Friday GMT, the Toyota Corolla (E210) page found itself as the top page, home page excepting. And the Kia Morning (TA) page shot up out of nowhere recently, too.
I know our page on the Corolla is number one on Mojeek for a search of that model but that can’t be the only reason it’s done so well. I haven’t studied the referrer data. A shame that link: no longer works on search engines.

Corolla fans, thank you for your extra 6,000 page views! It’s helped our overall total, but the viewing rate is still down at 2019 levels thanks to the collapse of the Bing index, and the search engines that it’s taken down with it.
I almost feel I’ve shot myself in the foot for promoting Duck Duck Go so much since 2010! But then I hopefully spared a lot of people from being tracked (as much) by the big G.
Tags: 2022, Autocade, Bing, JY&A Media, Mediawiki, Mojeek, publishing, statistics, Toyota Posted in cars, interests, internet, media, New Zealand, publishing, technology | No Comments »
05.08.2022

Above: Some French text in Lucire.
Regular Lucire readers will have seen a number of articles run in English and French (and one in Japanese) on our main website. Typographically, the French ones are tricky, since we have to distinguish between non-breaking spaces and non-breaking thin spaces, and as far as I know, there is no code for the latter in HTML. Indeed, even with a non-breaking space, a browser can treat it as it would a regular space.
So what’s our solution? Manually, and laboriously, putting in <NOBR> tags around the words that cannot be broken. It’s not efficient but typographically, it makes the text look right and, unless we’ve missed one, we don’t have the problem of guillemets being left on a line by themselves without a word to attach to.
The language is set to fr in the meta tags.
Among our French colleagues, I have seen some go Anglo with their quotation marks and ignore the traditional French guillemets. Others omit any thin spaces and, consequently, adopt the English spacing rules with punctuation. For some reason, we just can’t bring ourselves to do it, and maybe there is an easier way that we haven’t heard of. I hope nos lecteurs français appreciate the extra effort.
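One possible easier way: Unicode does define a narrow no-break space, U+202F, which can be written in HTML as the numeric character reference `&#8239;`. Below is a minimal sketch that converts the ordinary spaces typed around French punctuation into the non-breaking kinds. Which space goes where is an assumption (house styles differ), the function names are hypothetical, and how well browsers and fonts render U+202F would need testing before dropping the <NOBR> tags.

```python
import re

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE
NBSP = "\u00a0"   # NO-BREAK SPACE

def french_spaces(text):
    """Replace ordinary spaces around French punctuation with non-breaking ones.

    Only spaces that are already present are converted, so URLs and times
    such as 'https://' or '10:30' are left alone.
    """
    text = re.sub(r"« ", "«" + NNBSP, text)            # after an opening guillemet
    text = re.sub(r" ([;!?»])", NNBSP + r"\1", text)   # before ; ! ? and closing guillemet
    text = re.sub(r" :", NBSP + ":", text)             # assumption: full no-break space before a colon
    return text

def to_html(text):
    """Write the special spaces as HTML character references."""
    return text.replace(NNBSP, "&#8239;").replace(NBSP, "&nbsp;")

sample = "« Bonjour ! » dit-elle : ça va ?"
print(to_html(french_spaces(sample)))
```

Run over an article before publishing, this would keep a guillemet or exclamation mark attached to its word without wrapping each pair by hand.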
Tags: 2020s, 2021, 2022, French, JY&A Media, Lucire, programming, publishing, typography Posted in business, design, interests, internet, media, publishing, technology, USA, Wellington | No Comments »
02.08.2022
This was something I had forgotten when doing the numbers on how many pages each search engine had indexed from our sites: what they claim to be their index size and what they let you access are two different things.
And in Lucire’s case, Google, curiously, mostly does not allow access to our dynamic pages in PHP in its main index, reserving them for Google News. Google News, however, has both PHP and HTML. It’s only when you feed in a specific request for one of our stories that we know is on a PHP-generated page that it comes up in the main index’s results.
Let me explain. Remember this from a blog post in July? These are what the search engines said they had indexed for lucire.com (in a site:lucire.com search). I’ve updated it for August 2 and added one more search engine, Yep, another independent, out of interest.
Google: 10,600
Mojeek: 3,593
Duck Duck Go: 50
Brave: 19
Bing: 10
Yep: 10
But can you see 10,600? Here’s the reality of what is truly visible at the moment when you browse the results’ pages of each search engine as of today:
Google: 304
Mojeek: 1,000
Duck Duck Go: 50
Brave: 19
Bing: 10
Yep: 10


Above: Google (top) shows fewer pages than Mojeek in a site: search.
Mojeek maxes out at 1,000 by design, but like Google, it will find a specific article outside of the 1,000 shown if searched for. Google conks out at 304 (303 when I first did this test).
The bigger Google index is its advantage, but Mojeek does a fine job by sharing more in its results’ pages than Google does: over three times as many. Another win for the plucky independent out of the UK.
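The gap between claimed and visible results can be worked out directly from the two lists above. A minimal sketch, using only the August 2 figures from this post:

```python
# Claimed index size versus results actually viewable, site:lucire.com, August 2.
claimed = {"Google": 10_600, "Mojeek": 3_593, "Duck Duck Go": 50, "Brave": 19, "Bing": 10, "Yep": 10}
visible = {"Google": 304, "Mojeek": 1_000, "Duck Duck Go": 50, "Brave": 19, "Bing": 10, "Yep": 10}

# Share of the claimed results that can actually be browsed, as a percentage.
share = {name: visible[name] / claimed[name] * 100 for name in claimed}

for name, pct in share.items():
    print(f"{name}: {visible[name]:,} of {claimed[name]:,} visible ({pct:.1f}%)")

mojeek_vs_google = visible["Mojeek"] / visible["Google"]
print(f"Mojeek shows {mojeek_vs_google:.2f} times as many results as Google")
```

Google lets you browse under 3 per cent of what it claims, Mojeek nearly 28 per cent, and the smaller indices show everything they claim.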
While we’re on the subject, notice how small the Bing index is getting, returning just 10 pages for lucire.com? It’s really collapsed in a big way. Feeding in the other sites I tested earlier, Bing shows declines all round, apart from Travel & Leisure.
Fancy having only 2,723 results from The New York Times, down from 1,190,000 on the 24th ult. Mojeek has over 1,000 times more than Bing, and Google over 12 times more than that.
Previous numbers in parentheses below.
Die Zeit
Google: 2,710,000 (2,600,000)
Mojeek: 4,891 (4,796)
Bing: 3,268 (3,770)
Annabelle (Switzerland)
Google: 11,900 (11,700)
Mojeek: 408 (405)
Bing: 26 (105)
Holly Jahangiri
Google: 618 (738)
Mojeek: 236 (222)
Bing: 10 (49)
The Gloss (Ireland)
Google: 17,600 (19,200)
Mojeek: 2,009 (1,968)
Bing: 20 (71)
The New York Times
Google: 36,500,000 (36,200,000)
Mojeek: 2,879,513 (2,823,329)
Bing: 2,723 (1,190,000)
Lucire
Google: 10,600 (6,050)
Mojeek: 3,593 (3,572)
Bing: 10 (50)
The Rake
Google: 11,100 (11,500)
Mojeek: 1,445 (1,443)
Bing: 16, but claims 4! (49)

Travel & Leisure
Google: 33,500 (28,100)
Mojeek: 10,081 (9,750)
Bing: 383 (220)
Microsoft
Google: 118,000,000 (122,000,000)
Bing: 1,927,118 (14,200,000)
Mojeek: 1,772,165 (1,748,199)
Detective Marketing
Google: 961 (998)
Mojeek: 579 (579)
Bing: 16 (51)
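The claims about Bing can be checked against the list above. A minimal sketch, using only the figures in this post (for The Rake, the 16 results shown rather than the 4 claimed):

```python
# Bing results per site: (August 2, previous count).
bing = {
    "Die Zeit": (3_268, 3_770),
    "Annabelle": (26, 105),
    "Holly Jahangiri": (10, 49),
    "The Gloss": (20, 71),
    "The New York Times": (2_723, 1_190_000),
    "Lucire": (10, 50),
    "The Rake": (16, 49),
    "Travel & Leisure": (383, 220),
    "Microsoft": (1_927_118, 14_200_000),
    "Detective Marketing": (16, 51),
}

def change(current, previous):
    """Percentage change from the previous count to the current one."""
    return (current - previous) / previous * 100

for site, (current, previous) in bing.items():
    print(f"{site}: {change(current, previous):+.1f}%")

# Sites where Bing's count went up rather than down.
risers = [site for site, (current, previous) in bing.items() if current > previous]

# The New York Times comparison across the three engines.
nyt = {"Google": 36_500_000, "Mojeek": 2_879_513, "Bing": 2_723}
mojeek_vs_bing = nyt["Mojeek"] / nyt["Bing"]
google_vs_mojeek = nyt["Google"] / nyt["Mojeek"]
print(risers, round(mojeek_vs_bing), round(google_vs_mojeek, 1))
```

Travel & Leisure is the only riser, and the New York Times ratios come out at a little over 1,000 and a little over 12 respectively.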
Tags: 2022, Bing, Duck Duck Go, Google, Microsoft, Mojeek, publishing, search engines, technology Posted in internet, technology, UK, USA | No Comments »
24.07.2022
Well, folks, here’s someone who’s done the maths. The stats in the last post suggested as much but the sample was so small.
Maurice de Kunder at WorldWideWebSize.com has a definitive graph:

His methodology is explained at his site.
I’d say late May or early June was when I noticed Duck Duck Go queries on Lucire become largely useless. After a month of seeing no improvement, I began looking into alternatives.
No one knows why, since Bing’s not going to admit any of this. If I were Duck Duck Go, I’d be looking into alternatives smartly. Anyone want to get in touch with Alltheweb and Inktomi? Their indices in the early 2000s were bigger than this.
PS.: I tried to tell the SEO sub-Reddit, but no joy. It was immediately removed.

The original text:
Since June I noticed that our internal site:domain.com searches powered by Duck Duck Go were not returning many results any more. As DDG is powered by Bing, I checked it out there, and, sure enough, we dipped from thousands of entries to 50 (and even 10 at one point). This is a 25-year-old site with decent inbound links.
I did a lot of investigating which I wrote up on my own blog (which I won’t link here due to sub-Reddit rules) and came across this website, which seems to suggest Bing has tanked. The person who runs it is pretty clued up on statistics.
I have run a small sample of 10 sites through the search engines as well and these back up their findings.
At this rate, Bing is smaller than Inktomi and Alltheweb in the early 2000s. What strikes me as weird is that all the Bing licensees haven’t done anything, either, so Duck Duck Go, Ecosia, Qwant, and Onesearch have all shrunk, too. (Swisscows is still reasonably sized.)
Anyone else been through something similar in the last two months?
Why don’t they wish to know? I would have thought this was rather serious for an SEO group.
Tags: 2022, Bing, internet, Microsoft, Netherlands, Reddit, research, search engine, technology, the Netherlands, World Wide Web Posted in internet, technology | 4 Comments »