Jack Yan

 






 






 

The Persuader

My personal blog, started in 2006. No paid or guest posts, no link sales.



02.02.2023

Why we’ve dropped Disqus, and the shenanigans of the online ad world

When I first signed up to Disqus, there was the option to have no ads. But with Lucire we allowed them, because I figured, why not?

Disqus’s rules were pretty clear: you’d earn money on the ads shown, and once you got to US$100, they’d pay out.

The trouble is those ads made so little money it took ages to reach the threshold.

Last year, when looking at the revenue figures, I was surprised things had reset and we had only earned a few dollars. Where did the US$100 go? There was no record of a payout.

I began enquiring and it took them a while to respond. They said they would pay (what would have happened if I never asked?) but what hit our account was NZ$100.

In other words, 35 per cent short.

I guess they’re counting on people not chasing up NZ$35, and I’m wondering if it’s a worthwhile use of my time. Or maybe it’s better I write this blog post to warn others about Disqus.

Disqus either short-paid us by 35 per cent or they have no clue how currencies work. Either way, it doesn’t reflect well on their company.
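The exact size of the gap depends on the exchange rate on the day of payment. As a rough sketch only, assuming a hypothetical rate of 65 US cents to the New Zealand dollar (an illustrative figure, not the actual rate Disqus or the bank used), the arithmetic behind the 35 per cent looks like this:

```python
# Hypothetical exchange rate, for illustration only: NZ$1 buys US$0.65.
ASSUMED_USD_PER_NZD = 0.65


def shortfall(owed_usd: float, paid_nzd: float, usd_per_nzd: float) -> dict:
    """Compare an amount owed in USD with a payment that arrived in NZD."""
    paid_usd = paid_nzd * usd_per_nzd   # value of the payment in USD
    short_usd = owed_usd - paid_usd     # amount still owing, in USD
    return {
        "paid_usd": paid_usd,
        "short_usd": short_usd,
        "short_pct": 100 * short_usd / owed_usd,
    }


result = shortfall(owed_usd=100, paid_nzd=100, usd_per_nzd=ASSUMED_USD_PER_NZD)
print(f"Short by {result['short_pct']:.0f} per cent (US${result['short_usd']:.2f})")
```

Swap in the real rate for the payment date and the percentage will move a little either way, but the point stands: NZ$100 is not US$100.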

Unsurprisingly, I began taking Disqus off our sites, which was what I had always planned to do once we got to US$100. Off it went from Lucire for starters, though on Autocade it had been quite useful. I had signed up early enough to have the no-ads option, so I left it, especially as we had great commenters like Graham Clayton from Australia, who has a wealth of knowledge about cars himself.

This week, we noticed the no-ads option had disappeared and the bottom of Autocade’s pages had turned into an ugly mess, at least on the desktop version. We already had our own ad in the footer, so we didn’t need multiple ones cheapening the site.

Not only did Disqus pay us short by 35 per cent last year, I discovered their ads don’t even pay. Yes, Disqus was included in our ads.txt. But here’s a site that gets 1,000,000 page views every quarter (roughly) and we had earned zip. Zero. Nada.
 

 

Once I understand how to update a Mediawiki database, we’ll have Mediawiki comments instead, and I’ve exported what we had from Disqus.

It’s been a bad run, but there you go.

Media.net also said they would drop publishers from certain countries, without naming them. That was fine by me since they also had odd discrepancies between what I knew to be the traffic and what they recorded. At one point, the Media.net ad code was hard-coded on Autocade’s pages, and still they were recording a minuscule amount of traffic.

With time zone differences (their person was in India) we never solved it.

Maybe an inordinate number of people use ad blockers?

We had till February 28 to remove their code but I took it off as well—no point dragging out yet another non-paying service.

It really feels like yet another area where Google has wrecked the advertising ecosystem for legitimate publishers. Oh for the days when there was more quality control over where ads appeared.
 
Ten years ago, we were hacked. That is a story in itself, which I documented at the time, along with Google’s failings. What also struck me was that the hack used what appeared to be Google Adsense code:
 

 

I had come across fake ads taking you to malware sites before, even with legitimate ad networks. (I still remember seeing a fake ad for a job-seeking website that wound up on our sites in April 2008.) But for some reason in 2013 it still seemed strange, since I didn’t deal with Google and some legit ad networks were still hanging on.

However, I noted on April 7, 2013, when researching what had happened, that it was entirely possible. And Google makes money no matter what.

I wrote: ‘The publisher’s site gets blacklisted and it takes days for that to be lifted, so the earnings go down. Who gains? The hackers and Google.’

The quotations I included in the 2013 post are sobering, with other publishers negatively affected by Google’s systems and inaction.

This week, almost 10 years later, I came across this.
 

 

Google, still useless after all these years. But hey, as long as they’re making money, right? Because the rest of us sure as heck aren’t, at least not through anything they touch. Their core business is a negligence lawsuit just waiting to happen.


Filed under: business, China, internet, marketing, media, publishing, technology, UK, USA—Jack Yan @ 21.29

30.01.2023

Is Microsoft trying to stem its losses from Bing?

If Appledystopia is right in its 2020 article, Microsoft loses US$1·5 milliard per annum on Bing. So maybe that explains why it’s worsened so much. Microsoft might well be finding ways to cut its losses, and servers cost money. Pity that none of the Bing clones are saying anything, not even Duck Duck Go’s usually vocal CEO.

I’m glad I discovered Mojeek when I did. We lost some traffic with Duck Duck Go’s near-dead internal search on Lucire, and overall I suspect everyone has lost traffic with Bing dying. With Google now also faltering (they still make plenty from the human farms, but you have to wonder just why it has worsened, even for existing sites), then it’s important that alternative, growing search engines—that’s engines, not services (so you can discount Ecosia, Neeva, Qwant, Duck Duck Go, and many others)—get our support.

There’s really only Mojeek in the occident with a growing index, regularly requiring new servers. If you aren’t anti-Russian, there’s Yandex; and China of course has Baidu. Brave and Yep are making great efforts but their indices are still small, though Yep can do better than Bing on some sites.


Filed under: business, internet, marketing, media, New Zealand, politics, publishing—Jack Yan @ 09.32


Nice try, Marissa Mayer, but no conversion

I had a chuckle at Marissa Mayer saying that Google results are worse because the web is worse.

As I’ve shown with a site:lucire.com search, which is a good one since our site pre-dates Google (just), Google is less capable of providing the relevant pages for a typical search.

I know how web spiders work in theory, and there’s no way that 2002 framesets are coming up in a 2023 crawl. We haven’t linked to those pages for a long, long time. But Google is throwing those into the top 10.
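To illustrate the theory, here is a minimal sketch of a link-following crawl over a made-up site. The page names and the link graph are hypothetical, and a real spider also draws on sitemaps, external links and its own record of previously seen URLs, which this toy version ignores. It only shows that a page nobody links to is not discovered by following links from the home page:

```python
from collections import deque


def crawl(start: str, links: dict[str, list[str]]) -> set[str]:
    """Breadth-first crawl: every page reachable by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical link graph: each page maps to the pages it links to.
site = {
    "/": ["/fashion/", "/beauty/"],
    "/fashion/": ["/fashion/2023-story.html"],
    "/beauty/": [],
    "/old-frameset-2002.html": ["/"],  # links out, but nothing links in
}

reachable = crawl("/", site)
orphans = set(site) - reachable  # known pages the crawl never reached
print(sorted(orphans))
```

If a search engine is still ranking the orphaned page highly, it is working from old records rather than from what a fresh crawl of the site would find.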

And we can extend this argument: Google, through its advertising, incentivized the creation of the very crap polluting the web.

Mayer said, ‘I think because there’s a lot of economic incentive for misinformation, for clicks, for purchases.

‘There’s a lot more fraud on the web today than there was 20 years ago.’

What’s the bet that these fraudulent pages are carrying Google ads?

As Don Marti, who knows a lot more about this than I do, said to me: ‘It’s all about moving traffic and ads away from sites that people want, and that advertisers want to sponsor, to places where Google gets a bigger % of the ad money (even if they’re on the sketchy side)’.

I think all this was foreseeable, and one could prove negligence on Google’s part. I still remember a time when established publishers like me wouldn’t join Google’s ad programmes because they were seen as an advertising service for second-rate (or worse) sites. They would appear on places like Blogger, which Google wound up buying.

Then the buggers wound up monopolizing the area, and things got worse for digital publishers as the ad rates got lower and lower—and, as Don notes, the money can find its way to the bottom feeders.

So Google does have a problem, and it is also the cause of a problem. Maybe breaking it up will solve some of them, and I’m glad the US Department of Justice is finally courageous enough to do something about it.
 
A spot-on insight from Brenda Wallace earlier today on Mastodon.
 

 

 
An irrelevant side note: it turns out the previous post was the 1,234th on this blog.


Filed under: culture, internet, media, publishing, technology, USA—Jack Yan @ 01.53

26.01.2023

Confused Google doesn’t understand email preferences

As if to prove my point about lies, Google spammed me right after my last post. In its footer:
 

 

No, I didn’t. I logged into Google to see my settings. Sure enough:
 

 
As I always say, when it comes to computers, I’m right, they’re wrong. I have a better memory.

I unsubscribed:
 

 
But I’m not signed up to ‘News and tips’. Not on any menu. Remember, Google? What’s the point of managing my preferences when you don’t know how to use them?
 

 

Basically, they’ll spam you when they want regardless of what your settings are.

Where’s the accountability here? Or are they desperate to shore up things now that the US Department of Justice has suddenly discovered it has some balls to take them on in an antitrust case?


Filed under: business, internet, technology, USA—Jack Yan @ 00.43


Now Google is worsening on a site: search: framesets from the early 2000s are in the top 10

This was never supposed to become a search engine blog, but like the Facebook “malware scanner” (or was that scammer?) and Google lying about its Ads Preferences Manager, I was forced to investigate when no one in the media (or, for that matter, the wider internet) did.

And over the years, those posts really helped people and exposed some wrongdoings.

Hence the latest obsession, about Bing, because no one seems to have noticed how Microsoft’s search engine is behaving as though someone at Redmond is unplugging servers left, right and centre.

Someone on Reddit suggested I try Kagi, which is a paid search engine—but from what I can tell, it’s a meta-search (the person who told me about it confirmed this, as did an earlier review).

I’ve seen meta-searches for decades, and admittedly Kagi is the prettiest of them all, but because it’s pulling from Bing and Google, it suffers from the limitations of both, especially the former.

We already have seen how Bing basically favours antiquity over currency, at least where Lucire is concerned, so Kagi’s results contain, in their top 10, pages that have not been updated (or linked) since the mid-2000s. When the Google-sourced results are factored in, it looks a bit better (since there are pages from the 2010s and 2020s), but they still aren’t the most relevant (since it seems Google has been faltering somewhat on site: searches, too).

Here’s a screen shot from Kagi. Results 1, 6 and 7 are current; result 3 is from the early 2010s; results 2, 4, 5 and 8 are framesets from the 2000s; result 9 is from 2014 and hasn’t been linked since then; the remainder are stories which can still be found through spidering but date from between 2011 and 2016.
 

 

Since it’s a meta-search, I decided to peer into Google and its top 10 do not look good, either. As I don’t tend to use Google, and the recent tests were about grabbing the number of search results, or analysing their currency, I hadn’t drilled down on a site:lucire.com search for a while.

Let’s see how they look today.
 


 
Surprisingly bad. Results 1 and 2 are current; results 3, 4 and 5 are framesets from the early 2000s that have not been linked since then; result 6 is from 2005 and has not been linked since then; result 7 is a 2011 story; result 8 is a 2022 story; result 9 is a 2016 story; and result 10 is a 2011 story.

In other words, the Google top 10 has changed probably due to their algorithm, but I wouldn’t call these relevant to what searchers seek. I could understand the old about.shtml staying in the top 10 despite its antiquity, but some of these top-level pages are really old. Framesets? Seriously?

Result 11 is repeated, which is also odd, while results 14 and 15 are tag pages from the Wordpress part of the site. The 15th is for Whangarei, not exactly the fashion centre of the world.

Google’s fall could explain why these blog posts have suffered traffic-wise as its search results are seriously irrelevant; there’s no connection to the pages’ popularity, either. It’s really beginning to feel like the Wayback Machine there, too.

Mojeek still makes more sense, since the search there requires a term, i.e. site:lucire.com lucire, so naturally it gives you pages containing the word Lucire more.
 

 

Result 1 is our home page (makes infinite sense!); result 2 a current top-level contents’ page; result 5 is the main page from Lucire TV; while the rest are stories that have the word Lucire contained in them more than what is typical for our site.

It looks like the US search engines are faltering while Mojeek is getting better. What an interesting development. I didn’t have worsening Google search on my 2023 bingo card.
 
Incidentally, for this website, Google still places my mayoral election pages from 2013 in its top 10; while Mojeek links the home page, the blog, a mixture of posts from 2009, 2020, 2021 and 2022, a transcript of a 2008 speech, and a tag page from 2010. Bing has pages from 2003 and 2012, but also some current top-level pages and, amazingly, three blog posts that are likely to be relevant (two of them critical about Bing from 2022 and 2023, and a 2021 post about Vodafone). In other words, Google has done the worst, in my opinion. Bing only has 10 pages so it has the smallest index but what it showed was surprisingly good! That leaves Mojeek, again, as delivering the best balance of relevance and index size.


Filed under: internet, publishing, technology, UK, USA—Jack Yan @ 00.15

23.01.2023

‘Google … broke the web’

Nice to see I’m not the only one who sees Google for what it is today. Warning: coarse language.
 

 

What’s bizarre is a reply I wrote largely in agreement (and had a few likes to) has vanished. Maybe some Google lovers didn’t like what I wrote?

Sometimes I can make the point better the second time around.

Strange, a reply I wrote in agreement has vanished.

Basically my earlier point was that Google has also destroyed a lot of legitimate publications’ earnings through depressing ad prices, diverting income to splogs, content mills and spun sites. Not to mention taking a decent cut for itself.

The whole enterprise is a massive con.

From a legal POV I would even say it was all foreseeable and a negligence lawsuit waiting for someone to take it on. It would be great to close it down.

The original reply linked to this post, which is also saying the emperor has no clothes—except this time it’s applied to Google. If Googlers are worried about that, then maybe I’ve cut very close to the chase. The one part which, when attacked, destroys the entire corrupt system.
 

 
PS.: Don Marti expresses my point far better than I did.
 


Filed under: business, internet, media, publishing, technology—Jack Yan @ 11.02

19.01.2023

For most sites, Bing continues to shrink


The New York Times’ presence on Bing has plunged back to the thousands—it was 2,723 on August 2
 
Back in July, I ran site: searches on a small range of websites to see just how bad things had got with Bing.

In January, I can report some have got worse. And back in July it was already pathetic.

The first figure below is from today, the parenthesized figure from July.

Remember that Mojeek is the only party that appears to report these figures honestly. Bing repeats results from page to page—around 40 per cent from the searches I’ve done with site:lucire.com. Google will show a few hundred so it’s anyone’s guess. I prefer Mojeek’s 1,000 cap and that works particularly well for the Lucire site.
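For anyone wanting to check a repeat rate like that themselves, a minimal sketch follows. It assumes you have already copied the result URLs from each page of a search into lists; the sample URLs here are hypothetical and are not the actual Bing results.

```python
def repeat_rate(pages: list[list[str]]) -> float:
    """Share of all returned results whose URL had already appeared earlier."""
    seen: set[str] = set()
    total = 0
    repeats = 0
    for page in pages:
        for url in page:
            total += 1
            if url in seen:
                repeats += 1
            else:
                seen.add(url)
    return repeats / total if total else 0.0


# Hypothetical result pages, five results each.
pages = [
    ["/a", "/b", "/c", "/d", "/e"],
    ["/a", "/b", "/f", "/g", "/c"],
]
print(f"{repeat_rate(pages):.0%} of the results were repeats")
```

A high rate means the headline result count overstates how many distinct pages the engine is really offering.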
 
Die Zeit (zeit.de)
Mojeek: 5,279 (4,796)
Google: 2,590,000 (2,600,000)
Bing: 6,010 (3,770)
 
Annabelle (annabelle.ch)
Mojeek: 882 (405)
Google: 14,000 (11,700)
Bing: 25 (105)
 
Holly Jahangiri (jahangiri.us)
Mojeek: 299 (222)
Google: 510 (738)
Bing: 10 (49) but reports 2
 
The Gloss (thegloss.ie)
Mojeek: 2,615 (1,968)
Google: 23,000 (19,200)
Bing: 71 (71)
 
The New York Times (nytimes.com)
Mojeek: 3,547,405 (2,823,329)
Google: 42,800,000 (36,200,000)
Bing: 5,170 (1,190,000)
 
Lucire (lucire.com)
Mojeek: 3,529 (3,572)
Google: 4,940 (6,050)
Bing: 10 (50)
 
The Rake (therake.com)
Mojeek: 1,382 (1,443)
Google: 10,900 (11,500)
Bing: 10 (49)
 
Travel & Leisure (travelandleisure.com)
Mojeek: 11,222 (9,750)
Google: 21,000 (28,100)
Bing: 15,100 (220)
 
Microsoft (microsoft.com)
Mojeek: 1,887,288 (1,748,199)
Google: 120,000,000 (122,000,000)
Bing: 340,000 (14,200,000)
 
Detective Marketing (detectivemarketing.com)
Mojeek: 591 (579)
Google: 835 (998)
Bing: 10 (51)
 

There we have it: some rises at Bing for Die Zeit and Travel & Leisure, steady at The Gloss, but notable falls at The New York Times (back into the thousands, down from millions) and Microsoft’s own website (340,000, down from over 14 million). If you’re an independent publication, your presence on Bing is not rosy, with Annabelle, Lucire and The Rake netting between 10 and 25 despite thousands of pages on each site; while my friend Stefan Engeseth’s Detective Marketing site is also down to 10 from an already low 51.
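The scale of those moves is easier to see as percentages. This short sketch uses only the Bing figures listed above (today's count first, July's second); the function name is just a label chosen for the example.

```python
# Bing site: counts from the list above, as (today, July).
bing_counts = {
    "Die Zeit": (6010, 3770),
    "Annabelle": (25, 105),
    "Holly Jahangiri": (10, 49),
    "The Gloss": (71, 71),
    "The New York Times": (5170, 1190000),
    "Lucire": (10, 50),
    "The Rake": (10, 49),
    "Travel & Leisure": (15100, 220),
    "Microsoft": (340000, 14200000),
    "Detective Marketing": (10, 51),
}


def pct_change(now: int, before: int) -> float:
    """Percentage change from the earlier count to the current one."""
    return 100 * (now - before) / before


for site, (now, before) in bing_counts.items():
    print(f"{site}: {pct_change(now, before):+.1f}%")
```

On these numbers The New York Times and Microsoft have each lost well over 90 per cent of their reported Bing presence since July.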

I know from Mojeek’s blog that they keep plugging in hard drives and servers to cope as their index expands. I can only assume from these numbers that Microsoft is unplugging them, though they seem to look after you more if you’re an establishment website from a big company.
 
PS.: Here’s another way of looking at the data, factoring in the round of tests I did on August 2.


Filed under: internet, media, publishing, technology, USA—Jack Yan @ 23.18


The emperor has no clothes, so Microsoft does what little it can do

When I’ve been saying the emperor has no clothes for the last few months (and on the emperor’s forums), I shouldn’t be surprised we are at this point now.

Bing is virtually dead, and they don’t want me probing Bing Webmaster Tools (which are largely useless) about my own sites to show up even more of their BS.
 

 

As moves go, this is pretty daft. I mean, it was pretty useless before, so now I wonder how much more BS there is. I guess whoever is running Bing wants to confirm to me that Bing is dead along with the rest of it.


Filed under: internet, technology, USA—Jack Yan @ 20.26

18.01.2023

Marking galleries private today

Along came Copytrack again yesterday, identifying an image that they allege we stole and put on Lucire’s website. And once again I had to go back through old emails (only 11 years this time, not 13 like the last) to retrieve the email proving that I had the correct licence to publish it, along with the download page where I got it (it’s one of the most famous fashion labels in the world and knowing their budgets, they’ve paid for press). You wonder why they don’t whitelist legitimate publications.

It’s all very well for them to use their automated systems but I have to get the DVD archive manually. I’m just incredibly fortunate that I’ve kept every email since the 1990s.

On that note, I’ve marked most of the gallery entries on this blog as private today. Pretty much every image in the gallery I know to be either licensed for press use or is a publicity pic. But some have come via social media. I simply recognized them to be the press images because I have a photographic memory, and, for fun, I’ve added them to the gallery. Even though legally I have numerous defences, and I’m pretty sure I’d prevail in case of any legal claim, for a personal blog it’s just too much of a hassle when these so-called copyright services come knocking. I’ll do the hunt for work but I’m not being paid to blog. I know a lot of you enjoyed those gallery posts but they’re going to be pretty limited moving forward.

There are plenty of nice pics at Lucire—feel free to pop by there for a gander.


Filed under: gallery, internet, media, New Zealand, publishing, technology—Jack Yan @ 01.28

16.01.2023

How the search engines fare on a site: search here

Time to do some analysis on the age of the search results for this site through the search engines. I’m curious about the drop in hits. ‘Contents’ pages’ also include static pages and, in Bing’s case, PDFs. (PS.: For clarification, a contents’ page would include a Wordpress tag page, or a page for a set month containing all that month’s posts.)
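For anyone wanting to repeat the exercise on their own site, tallies like the ones below can be produced from a plain list of publication years, one per search result. This is only a sketch: the sample years are hypothetical, and working out the year of each result (from the URL or the page itself) is left to the reader.

```python
from collections import Counter


def star_chart(years: list[int], first: int, last: int) -> str:
    """One line per year from first to last, with one star per result."""
    counts = Counter(years)
    lines = []
    for year in range(first, last + 1):
        stars = "★" * counts[year]
        lines.append(f"{year} {stars}".rstrip())
    return "\n".join(lines)


# Hypothetical sample: the year of each result in a top-50 list.
sample_years = [2009, 2009, 2021, 2022, 2022, 2022]
print(star_chart(sample_years, 2002, 2023))
```

Contents’ pages have no single year, so they would be counted separately, as they are in each chart here.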
 
Mojeek
Contents’ pages: ★★★★★★★★★
2002
2003
2004
2005
2006 ★★
2007 ★
2008 ★★
2009 ★★★★★★
2010 ★
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020 ★★
2021 ★★★★★★★★★★★
2022 ★★★★★★★★★★★★★
2023
 
Interesting spread, and no problems indexing PHP pages (after 2010). Some repeat results, with Mojeek having both www.jackyan.com and jackyan.com versions of the same pages. I’m surprised at the gap between 2010 and 2020, though they do appear after the 50 mark.
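Repeats of the www and non-www kind are easy to spot by reducing each URL to a common form before counting. The sketch below is one way to do that, assuming the only differences are the scheme and a leading www; the paths in the sample are hypothetical.

```python
from urllib.parse import urlsplit


def canonical(url: str) -> str:
    """Reduce a URL to host plus path, ignoring scheme and a leading 'www.'."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    return host + path


def unique_results(urls: list[str]) -> list[str]:
    """Keep the first URL seen for each canonical form, preserving order."""
    seen: set[str] = set()
    kept = []
    for url in urls:
        key = canonical(url)
        if key not in seen:
            seen.add(key)
            kept.append(url)
    return kept


# Hypothetical results showing the www and non-www versions of one page.
results = [
    "https://www.jackyan.com/blog/",
    "https://jackyan.com/blog/",
    "https://jackyan.com/",
]
print(unique_results(results))
```

This deliberately ignores query strings and trailing-slash differences, so it will undercount repeats on sites where those vary.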
 
Google
Contents’ pages ★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
 
Now that was a surprise. Only the static, HTML pages, with a lot of ex-Blogger indices (which were also HTML). Talk about being a Wayback Machine. No individual blog posts at all and a lot of really old stuff that isn’t even linked any more. I expected Yandex to do something like this, not Google.
 
Bing
Contents’ pages ★★★★★★★★★
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023 ★
 
Still bizarre. Bing claimed it had six results and delivered 10 on the first page. One blog post from 2023 makes it in here: it’s one attacking Bing and calling it near death. (Of the ones after the 3rd, it’s done marginally better, though it’s still hundreds off the norm.) During the course of the day, the 50-something results Bing had for site:jackyan.com have fallen to 10. Talk about decaying.

Interestingly, Bing gives 50 or so results on mobile—something I discovered this morning after compiling the above and before I pressed ‘Publish’ in Wordpress.
 
Yandex
Contents’ pages ★★★★★★★
2002
2003
2004
2005
2006 ★★★★★★★★★★★★★
2007 ★★★★★★★★★
2008 ★★★
2009 ★★★★★★
2010 ★★★★
2011 ★★
2012
2013
2014
2015
2016
2017
2018
2019 ★★
2020 ★
2021
2022
2023
 
Some repeated results and definitely in favour of static HTML pages (pre-2010) over dynamic ones.
 
Baidu
Contents’ pages ★★★★★★★★
2002
2003
2004
2005
2006
2007
2008
2009
2010 ★
2011 ★
2012
2013
2014 ★
2015
2016
2017 ★★★★
2018 ★★
2019 ★
2020 ★★★★★★★★★
2021 ★★★★★★★★★★★★★★★
2022 ★★★★★★
2023
 
Baidu gives the wrong date for a lot of results, and there was a repeated result, too. But a pretty good site search and far closer to what I expected I would see, since it’s the post-2010 blog posts that I thought were more significant. There were a few in 2006 that got me some international mainstream media coverage and appearances on Aljazeera English’s Listening Post in those early days, but the most read blog entries were from 2016.
 
Yep
Contents’ pages ★★★★
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014 ★★
2015
2016
2017 ★
2018
2019
2020 ★★
2021
2022 ★
2023
 
Not bad for a newbie in beta, spidering both static and dynamic (PHP) pages. Better than Bing’s mix for the 10 each delivers.

Gigablast delivers none.

I can’t say for sure what caused the traffic drop based on the above, since I haven’t documented one of these searches before. So I’ve nothing to compare it to, though my vague memory is that Google would have had some of my actual posts among the top 50. A lot of the pages it does have there aren’t that highly trafficked. Could we blame Google?

Sadly, I don’t have enough data to know for sure, but on the face of it, Google’s top 50 are anomalous, while Bing continues to demonstrate that it’s largely useless.
 
PS.: Just tried site:bing.com. Bing’s results were terrible, including some real estate searches for homes in France, lots of repeated results. Mojeek and Google delivered better results for site:bing.com than Bing did.


Filed under: business, China, internet, technology, UK, USA—Jack Yan @ 19.56
