What search engines show in their top 10 isn’t always relevant


The Bing collapse did lead me to look at some of the ancient pages on the Lucire site that the search engines were still very fond of. For instance, the ‘About’ page was still appearing up top, which is bizarre since we haven’t made any links to it for years—it reflected our history in 2004.

Naturally, once I updated it, it promptly disappeared from Bing! Too new for Microsoft’s own Wayback Machine!

I was always told that you shouldn’t delete old pages, and that 301s were the best solution. I’m enough of a computing neophyte to not know how to implement 301s (.htaccess doesn’t work, at least not on our set-up) and page refreshes are often frowned upon, which is why so many old pages are still there.

However, you would naturally expect that a web spider following links would not rank anything that hasn’t been linked to for over a decade very highly. If the spider comes in, picks up the latest stuff from your home page, possibly the latest stuff from individual topic pages, it would figure out what all of these were linking to, and conclude that something from 2000 that was buried deep within the site was no longer current, or of only passing interest to surfers.

I realize I’ve had a go at search engines for burying relevant things in favour of novel things, but we’re talking pages here that aren’t even relevant. ‘About’ I’ll let them have, but a 2000 book reviews’ page? A subject index page from 2005 that hasn’t been linked to since 2005, and the pages that do link to it are well outnumbered by newer ones? Because, the deletion of ‘About’ aside, here is what Bing thinks is the most important for


Google fares a little better. Our home page and current print edition ordering page are top, shopping is third, followed by the fashion contents’ page (makes sense). ‘About’ comes in fifth, for whatever reason, then a 2005 competition page that we should probably delete (it refreshes to another page from 2005—so much for refresh pages being bad for search engines).

Seventh is yet another ancient page from 2005, namely a frameset—which I’ve since updated so at least the main frame loads something current. The remainder are articles from 2011, 2022 and 2016. The next page comprises articles and tags, which seem to make sense.

Mojeek actually makes more sense than Google. Home page in first, the news page (the next most-updated) is second, followed by the travel contents’ page. Then there are two older print edition pages (2020 and 2012), followed by a bunch of articles (2013, 2014, 2013, 2013), and the directory page for Lucire TV. There’s nothing here that I find strange: everything is logically found by a spider going through the site, and maybe those four articles from the 2010s are relevant to the word Lucire (given that you can’t do site: searches on Mojeek without a keyword, so it repeats the word before the TLD)? The reference to the 2012 issue might be down to my having mentioned it recently during our 25th anniversary posts. But there are no refresh pages and no framesets.

Startpage, not Google, has a couple of frameset pages from 2000 and 2002 in their top 10 which again weren’t linked to, at least not purposefully (they were placed there to catch people trying to look at the directory index in the old days). There’s incredibly little “link juice” to these pages. However, ‘About’ (in 10th), and these two framesets aside, its Google-sourced results fare remarkably well. In order: home page, print edition ordering page, the two framesets, the news section, the shopping page (barely updated but I can see why it’s there), the community page, Lucire TV, the fashion contents, ‘About’.

Duck Duck Go is so compromised by Bing that it barely merits a mention here. Four pages from 2000 and 2005 that no current page links to, a 404 page that we’ve never even had on our site (!), articles from 2021, 2018, 2007 and 2000 (in that order), and a PDF (!) from 2004. Fancy having a 404 that never even existed in the top 10!

If I had my way, it’d be home page, followed by the different sections’ contents’ pages, then the most popular article—though if a couple of articles go (or went) viral, then I’d expect them sooner.

Both Mojeek and Google do well here, with four of these pages each in their top 10s. But it’s Startpage’s unfiltered Google results that do best, hitting linked, relevant pages in seven results out of the top 10. Bing and its licensees miss the mark completely. If you must have a Google bias, then Startpage is the way to go; for our purposes, Mojeek remains the better option.
★★★★★★★☆☆☆ Startpage
★★★★☆☆☆☆☆☆ Mojeek
★★★★☆☆☆☆☆☆ Google
★★☆☆☆☆☆☆☆☆ Virtual Mirage
★☆☆☆☆☆☆☆☆☆ Baidu
★☆☆☆☆☆☆☆☆☆ Yandex
☆☆☆☆☆☆☆☆☆☆ Bing
☆☆☆☆☆☆☆☆☆☆ Qwant
☆☆☆☆☆☆☆☆☆☆ Swisscows
☆☆☆☆☆☆☆☆☆☆ Brave
☆☆☆☆☆☆☆☆☆☆ Duck Duck Go (would give –1 for the 404 if I could)

Laying out French articles in HTML takes a long time


Above: Some French text in Lucire.
Regular Lucire readers will have seen a number of articles run in English and French (and one in Japanese) on our main website. Typographically, the French ones are tricky, since we have to distinguish between non-breaking spaces and non-breaking thin spaces, and as far as I know, there is no code for the latter in HTML. Indeed, even with a non-breaking space, a browser can treat it as it would a regular space.

So what’s our solution? Manually, and laboriously, putting in <NOBR> tags around the words that cannot be broken. It’s not efficient but typographically, it makes the text look right and, unless we’ve missed one, we don’t have the problem of guillemets being left on a line by themselves without a word to attach to.

The language is set to fr in the meta tags.
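
For anyone wanting to experiment, here is a minimal sketch in Python of one possible alternative to wrapping words by hand. It relies on the Unicode character U+202F (NARROW NO-BREAK SPACE), which can be written in HTML as the numeric reference &#8239;. The function names are hypothetical, the spacing convention (a thin space before ; ! ? and a full non-breaking space before a colon and inside guillemets) is an assumption that varies between French style guides, and whether browsers and fonts render and break U+202F properly would need testing, so this is not a confirmed replacement for the <NOBR> approach.

```python
import re

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE (thin, non-breaking)
NBSP = "\u00a0"   # NO-BREAK SPACE (full width, non-breaking)


def french_spaces(text: str) -> str:
    """Swap ordinary spaces next to French punctuation for no-break ones.

    Assumes plain text that already has ordinary spaces in the right places.
    """
    # Assumed convention: thin no-break space before ; ! ?
    text = re.sub(r" ([;!?])", NNBSP + r"\1", text)
    # Assumed convention: full no-break space before a colon
    text = re.sub(r" (:)", NBSP + r"\1", text)
    # Assumed convention: full no-break space inside guillemets
    text = text.replace("« ", "«" + NBSP)
    text = text.replace(" »", NBSP + "»")
    return text


def to_html(text: str) -> str:
    """Write the two special spaces as HTML character references."""
    return text.replace(NNBSP, "&#8239;").replace(NBSP, "&nbsp;")
```

Run on plain text before any tags go in, to_html(french_spaces("« Bonjour ! »")) gives «&nbsp;Bonjour&#8239;!&nbsp;» which keeps each guillemet attached to its neighbouring word without a wrapper tag.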

Among our French colleagues, I have seen some go Anglo with their quotation marks and ignore the traditional French guillemets. Others omit any thin spaces and, consequently, adopt the English spacing rules with punctuation. For some reason, we just can’t bring ourselves to do it, and maybe there is an easier way that we haven’t heard of. I hope nos lecteurs français appreciate the extra effort.

Putting the search engines through their paces


One more, and I might give the subject a rest. Here I test the search engines for the term Lucire. This paints quite a different picture.

Lucire is an established site, dating from 1997, indexed by all major search engines from the start. The word did not exist online till the site began. It does exist in old Romanian. There is a (not oft-used) Spanish conjugated verb, I believe, spelt the same.

The original site is very well linked online, as you might expect after 25 years. You would normally expect, given its age and the inbound links, to see it at the top of any index.

There is a Dr Yolande Lucire in Australia, whom I know and am used to seeing in the search engine results.

The scores are simply for getting sites relevant to us into the top 10, and no judgement is made about their quality or relevance.
Google
—I hate to say it, as someone who dislikes Google, but all of the top 10 results are relevant. Fair play. Then again, with the milliards it has, and with this as its original product, it should do well. 10/10
Mojeek might be flavour of the month for me, but these results are disappointing. Scopalto retails Lucire in France, so that’s fair enough, but disappointing to see the original site in fourth. Fifth, sixth, seventh, ninth and tenth are irrelevant and relate to the Spanish word lucir. You’d have to get to no. 25 to see Lucire again, for Yola’s website. Then it’s more lucir results till no. 52, the personal website of one of our editors. 5/10
Swisscows
—Considering it sources from Bing, it makes the same mistakes by placing the rarely linked up top, and in third. Fourth, ninth and tenth are irrelevant, and the last two relate to different words. Yola’s site is seventh, which is fair enough. 6/10
Baidu
—Interesting mixture here. Strange, too, that comes up top. We own but it’s now a forwarding domain (it was once our link shortener, up to a decade ago). Seventh and ninth relate to the Romanian word strălucire and eighth to the Romanian word lucire. The tenth domain is an old one, succeeded a couple of years ago by Not very current, then. 7/10
Startpage
—All relevant, as expected, since it’s all sourced from Google. 10/10
Virtual Mirage
—I don’t know much about this search engine, since I only heard about it from Holly Jahangiri earlier today. A very good effort, with only the ninth one being irrelevant to us: it’s a paper co-written by Yola. 9/10
Yandex
—This is the Russian version. All are relevant, and they are fairly expected, other than the ninth result which I’ve not come across this high before, although it still relates to Lucire. 10/10
Bing
—How Bing has slipped. There are sites here relating to the Spanish word lucirse and to Lucira, who makes PCR tests for COVID-19. One is for Yola. 7/10
—For a Bing-licensed site, this is even worse. No surprise to see gone here, given how inconsistently Bing has treated it of late. But there are results here for Lucira and a company called La Cire. The Amazon link is also for Lucira. 3/10
—The sites change slightly if you use the search box at The Reverso page is for the Spanish word luciré. Sixth through tenth are irrelevant and do not even relate to the search term. Eleventh and twelfth are for and, so there were more relevant pages to come. The ranking of relevant results, then, leaves something to be desired. 5/10
Duck Duck Go
—Well, at least the Duck puts up top, and the home page at that (even if Bing can’t). Only four relevant results, with Lucire Men coming in at tenth. 4/10
Brave
—For the new entrant, not a bad start. Shame about the smaller index size. All of these relate to us except the last two, one a dictionary and the other referring to Yolande Lucire. 8/10

The results from these first results’ pages are surprising.
★★★★★★★★★★ Google
★★★★★★★★★★ Yandex
★★★★★★★★★★ Startpage
★★★★★★★★★☆ Virtual Mirage
★★★★★★★★☆☆ Brave
★★★★★★★☆☆☆ Baidu
★★★★★★★☆☆☆ Bing
★★★★★★☆☆☆☆ Swisscows
★★★★★☆☆☆☆☆ Mojeek
★★★★☆☆☆☆☆☆ Duck Duck Go
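
The star charts in these posts are just the scores out of ten drawn as filled and empty stars. A small Python sketch of that conversion follows; the function name is hypothetical, and the four scores are taken from the write-ups above.

```python
def star_bar(score: int, out_of: int = 10) -> str:
    """Draw a score as filled stars followed by empty ones."""
    if not 0 <= score <= out_of:
        raise ValueError("score must be between 0 and out_of")
    return "★" * score + "☆" * (out_of - score)


# Scores quoted in the write-ups above.
scores = {"Google": 10, "Virtual Mirage": 9, "Mojeek": 5, "Duck Duck Go": 4}

# Highest score first, one line per search engine.
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(star_bar(score), name)
```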

It doesn’t change my mind about the suitability of Mojeek for internal searches though. It’s still the one with the largest index aside from Google, and it doesn’t track you.

Facebook saves private medical information despite saying it gets scrubbed


As embedding from Mastodon is not working tonight, I’ll copy and paste Per Axbom’s post:

Nice bit of reporting from Swedish Radio. They built an online fake pharmacy and activated Facebook advertising tools. Thousands of simulated visits to the pharmacy were made each day, and the reporters could see all the sensitive, personal information being stored by Facebook.

Facebook sent no warnings to the pharmacy, despite saying they have tools in place to prevent this from happening.

A few weeks ago they revealed how this was happening with real pharmacies.

He links this article from Sveriges Radio.

So, how long has it been since Cambridge Analytica? We can safely conclude that this is all by design, as it has been from the start.

July 2022 gallery


Here are July 2022’s images—aides-mémoires, photos of interest, and miscellaneous items. I append to this gallery through the month.

Pirate sites, content mills and splogs exist because of Google


In chatting to Alexandra Wolfe on Mastodon about the previous post, I had to draw a sombre conclusion. If it weren’t for Google, there’d be no incentive to do content mills or splogs.

I replied: ‘People really are that stupid, and it’s all thanks to Google. Google doesn’t care about ad fraud, and anyone can be a Google publisher. So scammers set up fake sites, they have a script trawling Google News for stories, and they have another script that rewrites the stories, replacing words with synonyms. Google then pays them [for the ads they have on their sites]. Every now and then they get someone like me who tries to look after our crew.’

Google is the biggest ad tech operator out there. And over the years, I’ve seen them include splogs in Google News, which once was reserved only for legitimate news websites. And when we were hacked in 2013, the injected code looked to me like Google Adsense code. You could just see this develop in the 2000s with Blogger, and it’s only worsened.

Have a read of this piece, which quotes extensively from Bob Hoffman, and tell me that Google doesn’t know this is happening.

Google is part of the problem but as long as they keep getting rich off it, what motive do they have to change?

Speaking of ad fraud, Bob Hoffman’s last couple of newsletters mention the Association of National Advertisers, who reported that ad fraud would cost advertisers $120 milliard this year. Conveniently enough for the industry, the ANA’s newsletter has since disappeared.
I still haven’t got into programmatic or header bidding or all the new buzzwords in online advertising, because I don’t understand them. And as it’s so murky, and there’s already so much fraud out there, why join in? Better buying simple ads directly with websites the old-fashioned way, since (again from Hoffman, in the link above):

Buying directly from quality publishers increases the productivity of display advertising by at least seven times and perhaps as much as 27 times compared to buying through a programmatic exchange.

Everyone wins.


Ad tech drives money to the worst online publishers. Ad tech’s value proposition is this: we will find you the highest quality eyeballs at the cheapest possible locations. Ad tech can do this because your web browser and mobile platform are vulnerable to a problem called ‘data leakage’ where your activity on a trusted site is revealed to other companies … If you’re a quality online publisher, ad tech is stealing money from you by following your valuable audience to the crappiest website they can be found on, and serving them ads there instead of on your site.

In other words, Google et al have an incentive to give ads to sploggers, who are getting rich off the backs of legitimate, quality publishers. And as to the intermediaries, I give you Bob Hoffman again, here.

Facebook admits we’re experiencing a bug preventing us from managing Lucire’s page


Last night’s hour-waster was chatting to Facebook Business Support. No, that’s unfair. I was actually assigned an incredibly good rep who took me seriously, and concluded that Facebook did indeed have a bug which means, of all the pages I can manage, the one for Lucire is alone in not allowing me, or any of its admins, to do anything. How coincidental, after losing Instagram and Twitter for periods during 2021.

Ironically, one editor can—of course someone who is supposed to have fewer privileges can do more. Such is Facebook.

A few things I learned. There’s a Meta Business Suite, which a whole bunch of pages got shoved into, whether you wanted it or not. My public page is there, for instance. It seems if you have Facebook and Instagram accounts for the same thing, you’re going to be in there.

Despite the two-factor authentication discussed in the previous post, I actually can get into the Business Suite, via another page I administer for a friend. From there I can get to Lucire’s tools.


I don’t need two-factor authentication for any of the other pages in there, including my own, and have full access.

Trisha, or Trish as she said I could call her, walked me through the steps, and asked me to get to the Suite page. Then she asked me to click ‘Create ad’, and I get this:

She asked me to check the account quality, and of course there are no issues:


She wrote: ‘Thanks for letting me know. It’s weird because I have checked all your assets here and it looks good. But, here’s what I suggest, Jack. We’ll need to report this to our Internal team so they can investigate. You might experience a bug or glitch.’

I theorized: ‘Just so you know, this page dates back to 2007 so maybe it is so old that Facebook’s servers can’t handle it?’

It wasn’t something she responded to, as she stayed on-subject, but it’s a theory worth entertaining, as it wouldn’t be the first time I’ve witnessed this.

So, for now, the one team member who can still go on Facebook for us posted this at my request:
All Lucire admins, all automated gadgets sending links to this page, and all Facebook-approved reposting sites, were blocked by Facebook on April 25. Therefore, till Facebook fixes this, there will be no more regular updates to this page other than a limited amount from one of our editors.

I doubt they’ll ever fix it, and two years ago I did say I wouldn’t really bother if Facebook went buggy and prevented us from updating again. Clearly I am bothering, as I know we have readers who use Facebook. But I have very little faith this will ever be fixed, since I have seen other reported bugs (some covered on this blog) get ignored for years, and this isn’t a fleeting bug, from what I can make out.

The lesson, as I have probably hinted at more than once, is never rely on a Big Tech service. The sites are so unwieldy that they get to a point where no one knows how to fix them. If earlier experiences are any indication, such as what I experienced at Vox, we have arrived at the end of Facebook pages.

Farewell to Drivetribe—and a reminder to keep your own copies


Not much of my old Drivetribe channel left now
Sadly, I was late to the demise of Drivetribe—though as some on Reddit point out, the brand still exists on other channels. But as for hosting the content themselves, that ended in January, and we content creators had till then to get our stuff off.

I had been checking in there less and less over 2021, which is a real shame. It had been a favourite site of mine—cars, and like-minded fanatics—but I guess it takes a lot more than a community to make a community.

Maybe it was the people I followed, but I never really got the right mix of news and entertainment. Others might beg to differ. I had little desire to follow the founders—Clarkson, Hammond, May, and Wilman (sorry chaps, I’ve watched you all in one shape or another since the 1990s, and given Wilman’s nude appearances on Top Gear, they are not necessarily shapes I want in my head)—so it was down to other content creators and contributors.

Twitter gives me some joy because of various car accounts there—Andrew at the Car Factoids and Andy with what must be a world-leading private brochure collection—and contributing seems a breeze. Drivetribe was somewhat hampered by a less-than-easy-to-use interface and somewhere along the line, in its first year if I recall correctly, the typography changed for the worse (at least to my eyes).

And like so many social networks, it was about keeping the content there in the hope it would generate money for the core business. It did indeed have a separate programme for creators, where they expected to share in the loot, but ironically after I was approved to join, I lost interest in contributing. Maybe it was because I had my own sites that I could work on. Autocade eats up spare time with each model taking a good 15 minutes on average to illustrate, research and write.

Anything I wrote for Drivetribe exclusively, and there were a few pieces, is practically toast. There may be a few links on the Wayback Machine, but the rest is online history. It’s hardly their fault: the closure was covered in automotive media extensively, although I never received any emails about it. It’s a lesson once again to ensure that you keep copies of your own content; in my case, I might still have them in WordPerfect format on a DVD-ROM somewhere. Relevant ones appeared in Lucire and Lucire Men.
Speaking of hosting your own stuff, I wonder if this is what the future holds.

This comes at a time when another Tweeter I follow has lost his Instagram account for no reason he can fathom, and I shared with him that I wouldn’t mind hosting my own photos on this very site. Instagram is a once-every-few-months network for me now, at least when it comes to posting on my personal account. (I’ll look at it more for Lucire.) If John is right, we could be looking at a separation again: those who can host their own will, and those who can’t, rely on the mass services. There could be less interaction between groups of people, but then the social networks only have themselves to blame for fostering toxicity. We are only human: we found others to interact with and learn from in the early 2000s before Facebook and Twitter, and we can again. We might even find it more productive as we claw our time back from those services.

And if it’s about traffic, each post I make here gets multiples more views than most things I’ve posted to Instagram. Seven hundred is pretty normal. Is there any point, then? The negatives seem to outweigh the positives, and this becomes truer every day. You’d be a mug to want to buy one of these services in 2022.

Letter-writing is not just a lost art


I hate sounding like a cranky old fella but there is definitely a generational divide now on letter-writing. I imagine something had to give if people are using apps to message each other (not great from a business point-of-view if maintaining an official record becomes this much harder). From a thread on Reddit:

How to end social media censorship


Kristina Flour/Unsplash
This Twitter thread by Yishan Wong is one of the most interesting I’ve come across. Not because it’s about Elon Musk (who he begins with), but because it’s about the history of the web, censorship, and the reality of running a social platform.

Here are some highlights (emphases in the original):

There is this old culture of the internet, roughly Web 1.0 (late 90s) and early Web 2.0, pre-Facebook (pre-2005), that had a very strong free speech culture.

This free speech idea arose out of a culture of late-90s America where the main people who were interested in censorship were religious conservatives. In practical terms, this meant that they would try to ban porn (or other imagined moral degeneracy) on the internet …

Many of the older tech leaders today … grew up with that internet. To them, the internet represented freedom, a new frontier, a flowering of the human spirit, and a great optimism that technology could birth a new golden age of mankind.

Fast forward to the reality of the 2020s:

The internet is not a “frontier” where people can go “to be free,” it’s where the entire world is now, and every culture war is being fought on it.

It’s the main battlefield for our culture wars.

Yishan points out that left-wingers can point to where right-wingers get more freedom to say their piece, and that right-wingers can point to where left-wingers get more. ‘Both sides think the platform is institutionally biased against them.’

The reality:

They would like you (the users) to stop squabbling over stupid shit and causing drama so that they can spend their time writing more features and not have to adjudicate your stupid little fights.

That’s all.

They don’t care about politics. They really don’t.

He concedes that people can be their worst selves online, and that the platforms struggle to keep things civil.

They have to pretend to enforce fairness. They have to adopt “principles.”

Let me tell you: There are no real principles. They are just trying to be fair because if they weren’t, everyone would yell louder and the problem would be worse …

You really want to avoid censorship on social networks? Here is the solution:

Stop arguing. Play nice. The catch: everyone has to do it at once.

I guarantee you, if you do that, there will be no censorship of any topic on any social network.

Because it is not topics that are censored. It is behavior.

I think Yishan’s right to some degree. There are leanings that the leaders of these social networks have, and I think that can affect the overall decisions. But he’s also right that both left and right feel aggrieved. I warned as much when I wrote about social media and their decision about Donald Trump in the wake of the incidents of January 6, 2021. I’ve seen left- and right-wing accounts get taken down, and often for no reason I can fathom.

Generally, however, civil discourse is a perfectly fine way to go, and for most things that doesn’t invite censorship or account removal. Wouldn’t it be nice if people took him up on this, to see what would happen?

Sadly, that could well be as idealistic as the ‘new frontier’ which many of us who got into the dot com world in the 1990s believed in.

But maybe he’s woken up some folks. And with c. 50,000 followers, he has a darn sight better chance than I have reaching just over a tenth of that on Twitter, and the 1,000 or so of you who will read this blog post.
During the writing of this post, Vivaldi crashed again, when I attempted to enter form data—a bug that they believed was fixed a few revisions ago. It appears not. I’ll still send over a bug report, but everything is pointing at my abandoning it in favour of Opera GX. Five years is a very good run for a browser.

