Posts tagged ‘Google’


Google’s advertising business is a negligence lawsuit waiting to be actioned

09.12.2022

Apparently the New Zealand government says Big Tech will pay a ‘fair price’ for local news content under new legislation.

Forget the newcomers like Stuff and The New Zealand Herald. The Fairfax Press, as the former was, was still running ‘The internet is scary’ stories at the turn of the century. What will Big Tech pay my firm? Any back pay? We have been in this game a long, long time. A lot longer than the newbies.

And what is the definition of ‘sharing’?

Because Google could be in for a lot.

Think about it this way: Google’s ad unit has enabled a lot of fake sites, scraped sites, spun sites, malware hosts, and the like, since anyone can sign up to be a publisher and start hosting their ads.

While Google will argue that they have nothing to do with the illegitimate usage of their services, some might look at it very differently.

Take the tort of negligence. To me this is classic Donoghue v. Stevenson [1932] AC 562 territory and as we’re at 90 years since Lord Atkin’s judgment, it offers us some useful pointers.

Lord Atkin stated, ‘You must take reasonable care to avoid acts or omissions which you can reasonably foresee would be likely to injure your neighbour. Who, then, in law is my neighbour? The answer seems to be—persons who are so closely and directly affected by my act that I ought reasonably to have them in contemplation as being so affected when I am directing my mind to the acts or omissions which are called in question.’

If you open up advertising to all actors (Google News also opens itself up to splogs), then is it foreseeable that unethical parties and bad faith actors will sign up? Yes. Is it foreseeable that they will host content illegally? Yes. Will this cause harm to the original copyright owner? Yes.

We also know a lot of these pirate sites are finding their content through Google News. Some have even told me so, since I tend to start with a softly, softly approach and send a polite request to a pirate.

I’d say a case in negligence is already shaping up.

If Google didn’t open up its advertising to all and sundry, then there would have been far fewer negative consequences—let’s not even get into surveillance, which is also a direct consequence of their policy and conduct.

Do companies that are online owe a duty of care to internet users? I’d say this is reasonable. I imagine some smaller firms might find it more difficult to get rid of a hacker, but overall, this seems reasonable.

Was this duty of care breached? Was there causation? By not vetting people signing up to the advertising programme, then yes. Pre-Google, ad networks were very careful, and I had the impression websites were approved on a case-by-case, manually reviewed basis. The mess the web is in, with people gaming search engines, with fake news sites (which really started as a way of making money), with advertising making pennies instead of dollars and scam artists all over the show, can all be traced to Google helping them monetize this conduct. There’s your obiter dicta right there. (Thanks to Amanda for remembering that term after all these years.)

Google hasn’t taken reasonable care, by design. And it’s done this for decades. And damages must be in the milliards to all legitimate publishers out there who have lost traffic to these unethical websites, who have seen advertising revenue plummet because of how Google has depressed the prices and how it feeds advertising to cheap websites that have cost their owners virtually nothing to run.

Make of this what you will. Now that governments are waking up after almost two decades, maybe Big Tech is only agreeing because it fears the rest of us will figure out that they owe way, way more than the pittance they’ll pay out under these legislative schemes?

Anyone with enough legal nous to give this a bash on behalf of the millions of legitimate publishers, past and present, directly harmed by Google and other Big Tech companies’ actions?

Posted in business, internet, media, New Zealand, publishing, technology, USA | No Comments »


November 2022 gallery

03.11.2022

Here are November 2022’s images—aides-mémoires, photos of interest, and miscellaneous items. I append to this gallery through the month.
 

Posted in cars, culture, France, gallery, humour, interests, internet, marketing, New Zealand, politics, publishing, TV, typography, UK, USA | No Comments »


We should challenge monopolists, not do business with them at the exclusion of ethical parties

17.10.2022

Search engine Mojeek is doing no wrong in my book. Here’s its CEO Colin Hayhurst being interviewed by The New Era’s Jeffrey Peel, making complete sense, which is not something I can say about anyone speaking for Big Tech. We should be shunning monopolists if we truly value progress and innovation, or even a proper, factual debate. We even have laws about it that few seem to wish to enforce when it comes to Big Tech players. It’s well worth a watch.
 
I was disappointed to see that the Warehouse, our big retailer, specifically blocks Mojeek from searching its site. Google is fine. Explanations vary—but they include the theory that the Warehouse wants to get data from its users and Google can provide them.

I’ve written to the Warehouse as an account holder and received no reply. I decided to take it higher, to its chief digital officer, on October 3. As far as I know this email has been delivered, but there’s always a possibility I have her address wrong. Regardless, I am yet to hear back on any front, including social media where I had asked the Warehouse why they would wish to block a legitimate and far more ethical search engine. What does it say about your company when you choose to do business with someone as questionable as Google, yet you go out of your way to block a fully ethical and privacy-respecting business?

Dear Sarah:
 
I contacted the Warehouse through the customer service channels at the beginning of September and have yet to hear back.

As CDO I think you’re the right person to raise this with, though please refer it to a colleague if you aren’t.

I run Lucire Ltd. and have been a Warehouse account holder for some time. Our own foundations are in the digital space, with my having been a digital publisher since 1989. We’re always mindful that our activities promote a healthy online space, which means we keep a watchful eye on the behaviour of US Big Tech. (For instance, we removed all Facebook gadgets from our sites in 2018, prior to the Cambridge Analytica exposé, as we became increasingly concerned about the tracking exposure our readers were getting.)

Our internal search is now run by Mojeek, a UK-based search engine that has the largest index in the west outside of Google. It is also my default, having lost faith in Duck Duck Go after 12 years.

Other than the Warehouse’s home page, none of the contents of your company’s site appear in Mojeek. When I raised this with them, they told me that Mojeek is very specifically blocked by the Warehouse. Neither they nor I can see any good reason a legitimate, independent search engine would be blocked.

I am told that inside your code is:
 
User-agent: MojeekBot
Disallow: /

 

As concerns over privacy grow, it seems a disservice that it’s blocked.

When I put this to other techs, they theorize that the Warehouse wants to track people via whatever data Google provides. I find this hard to believe. To what end? The amount of information that comes surely can’t outweigh overall accessibility to the website for those of us who have concerns over Google’s monopolistic behaviour and privacy intrusions.

Even if tracking were the reason, I would have thought there would be no great loss allowing a tiny percentage of people to come in via a Mojeek search result and browse the site—including customers like me who had the intent to see what you had in stock with a view to purchasing the item.

I genuinely hope this is something that will be looked into and that a New Zealand company I admire (one which is connected to me through a round-about way—I was educated by relatives of the Tindalls) isn’t party to upholding the Google monopoly.
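
The two robots.txt lines quoted in the letter can be checked with Python’s standard library. This is a minimal sketch that treats those two lines as the whole file (the real file is not reproduced here, so that is an assumption), and the page URL is a hypothetical placeholder.

```python
from urllib.robotparser import RobotFileParser

# The two lines quoted in the letter, treated as a complete robots.txt.
# Assumption: the real file may contain other rules not shown in the post.
ROBOTS_TXT = """\
User-agent: MojeekBot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://www.example.com/some-product"  # hypothetical URL
    for bot in ("MojeekBot", "Googlebot"):
        print(bot, "allowed" if is_allowed(bot, page) else "blocked")
```

Under these two lines alone, a crawler identifying itself as MojeekBot is shut out of every path, while any crawler not named in the file is unaffected.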

Posted in business, internet, New Zealand, technology | No Comments »


Forget the 2010s and 2020s, Bing’s results are firmly in the 2000s now

09.10.2022

Immediately after blogging about Bing being able to pick up an article from 2022, Microsoft’s collapsing search engine has reverted to being the Wayback Machine. There was just over a week of it living in the 2020s, but it seems it’s too much for them.

It’s back to, well, Bing Vista, for want of a better term. Of the 50 results (out of a claimed 120!) that it’s capable of returning for site:lucire.com, here is how it breaks down based on the publication year of the article. Since my last test, Bing has eliminated the 2018 and 2019 results (one page per year). We wouldn’t want to think it could deliver anything from the last decade, would we?
 
Bing
Contents’ pages ★★
1997
1998
1999
2000
2001 ★★★★★
2002
2003 ★★★★
2004 ★★★★
2005 ★★
2006 ★
2007 ★★★★★★★
2008 ★★
2009 ★★
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
 

There were 29 unique results, which means 21 of the 50 were repeats: 42 per cent! Bing says it had 120 results but really only had 29. To fill up the 50 it had to pad the list with 21 repeated entries!
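
The repeat arithmetic above is easy to reproduce. This is a small sketch, not the method used for the post: it takes a list of result URLs and reports how many are unique and what share are repeats. The sample list is made-up placeholder data built only to match the 29-and-21 split.

```python
from collections import Counter


def repeat_stats(results):
    """Return (unique_count, repeat_count, repeat_share) for a list of result URLs."""
    counts = Counter(results)
    unique = len(counts)
    repeats = len(results) - unique
    share = repeats / len(results) if results else 0.0
    return unique, repeats, share


if __name__ == "__main__":
    # Placeholder data: 29 distinct pages, 21 of which are shown a second time.
    sample = [f"page-{i}" for i in range(29)] + [f"page-{i}" for i in range(21)]
    unique, repeats, share = repeat_stats(sample)
    print(f"{unique} unique, {repeats} repeats, {share:.0%} of the list")
```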

Let’s see how Google fared for the first 50 results.
 
Google
Contents’ pages ★★★★★★★★★★
1997
1998
1999
2000
2001
2002 ★★
2003
2004 ★★
2005 ★
2006
2007 ★
2008
2009 ★
2010 ★★
2011 ★★★
2012 ★★
2013 ★★
2014 ★★★
2015 ★★
2016 ★★
2017 ★★
2018 ★★
2019 ★★★
2020 ★★★
2021 ★★
2022 ★★★★★
 
Google has moved again since we began looking at things. In an earlier test tonight, Google had two repeat results, which was a surprise. But I wasn’t able to replicate it when I did the one for the blog post.

No such issues at Mojeek, where every entry is unique. It really is more capable than the other two of delivering results for site searches.
 
Mojeek
Contents’ pages ★★★★★★★★
1997
1998
1999
2000
2001
2002
2003
2004 ★
2005
2006
2007
2008
2009 ★
2010 ★★
2011 ★★
2012 ★★★
2013 ★★★★★
2014 ★★★
2015 ★★★★★
2016
2017
2018
2019 ★★★★
2020 ★★
2021 ★★★★★★★★★
2022 ★★★★★★
 
An improvement on our September 21 test: Mojeek has managed to capture more 2020s pages as part of its top 50.

I won’t run the other search engines through this—I just wanted two points of comparison to highlight how ridiculous Bing remains, with the resultant effect on web traffic. It means Duck Duck Go, Qwant, Ecosia, Yahoo and others, which are also Bing, are just as compromised.

I might lay off them for a while as we know it’s crap and things aren’t going to change. Microsoft has firmly entrenched itself as a bunch of liars, like their other Big Tech counterparts.

Posted in internet, technology, USA | No Comments »


Startpage isn’t what I thought it was—but then Google does the opposite to what you think

05.10.2022

Startpage says it licenses Google’s results but gives us privacy. So, if you want Google-level, Google-biased results, but don’t want their tracking, you use Startpage.

Um, no. Let’s just take a random search for a screenwriter I once mentioned on this blog:
 


 

It’s quite a bit slower than Google, too. The results are usually geographically biased, even when you have the region switched off.

What’s curious is that, at the same location with the same IP address, I get six Google results on desktop and 16 on mobile. I’m not sure what the sense is in that.
 


 

I realize there are a lot of mobile users, but it seems strange to limit what can be found on the desktop version. Surely the opposite would make sense since not all sites are mobile-optimized?

It’s like Google Maps: for me, it’s not accessible on a cellphone any more (and hasn’t been for months—I discovered this when Amanda and I went on holiday at the end of August and there was no Google Maps anywhere in the country) but remains available on a desktop. The geniuses at Google do realize that people are more likely visiting Maps on a phone than sitting in their offices, right?

It doesn’t matter where I try, even from the office network: Google Maps is not available on my phone. The site is not just unavailable, it doesn’t even resolve (whether you use maps.google.com or google.com/maps).
 

 

Usually I find that expecting the opposite of what US Big Tech says is really useful.

Better use paper maps, because the satellites are often switched off and the map programs on your phone think you are nowhere!
 

 

Coming back to the original topic, Startpage says it pays Google for this.

Better ask for a refund, folks.

Posted in internet, New Zealand, technology | No Comments »


Testing the search engines: Bing likes antiquity; most favour HTML over PHP

21.09.2022

Bing is spidering new pages, as long as they’re very, very old.

Last week, we added a handful of Lucire pages from 1998 and 1999. An explanation is given here. And I’ve spotted at least two of those among Bing’s results when I do a site:lucire.com search.

As a couple of newer pages have also shown up, I doubt there’s any issue with the template; and the home page now appears, too. But, by and large, Bing is Microsoft’s own Wayback Machine, and most of the Lucire results are from the 1990s and early 2000s.

It got me thinking: do the other search engines do this, too? For years, Google grandfathered older pages and they came up earlier. (Meanwhile, searches for my own name still have this site, and the company site, down, having lost first and second when we switched from HTTP to HTTPS in March. Contrary to expert opinion, you don’t recover, at least not quickly.)

As Lucire includes the date of the article in the URL, this should be an easy investigation. We’ll only do the first 50 results as that’s all Bing’s capable of. I’ll try not to include any repeat results out of fairness. ‘Contents’ pages’ include the home page, the Lucire TV and Lucire print shopping pages, and tag and category pages.
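
A tally like the ones that follow could be automated along these lines. This is a sketch under stated assumptions: the year is taken to appear as a four-digit path segment such as `/2001/` (the post does not spell out the exact URL scheme), anything without one is counted as a contents’ page, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: the article year appears as a path segment, e.g. /2001/.
YEAR_IN_PATH = re.compile(r"/((?:19|20)\d{2})/")


def tally_by_year(urls, first=1997, last=2022):
    """Count unique URLs per publication year; the rest count as contents' pages."""
    counts = Counter()
    for url in set(urls):  # repeats are dropped, as in the tests here
        match = YEAR_IN_PATH.search(url)
        year = int(match.group(1)) if match else None
        if year is not None and first <= year <= last:
            counts[year] += 1
        else:
            counts["contents"] += 1
    return counts


def star_chart(counts, first=1997, last=2022):
    """Render the tally in the one-star-per-result layout used in this post."""
    lines = ["Contents’ pages " + "★" * counts["contents"]]
    for year in range(first, last + 1):
        lines.append(f"{year} " + "★" * counts[year])
    return "\n".join(line.rstrip() for line in lines)
```

Feeding it the first 50 URLs copied from a results page would give one row per year, with empty years left blank.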
 
Bing
Contents’ pages ★★★
1997
1998
1999 ★★★★
2000 ★
2001 ★★★★★★★★
2002 ★★
2003 ★★★
2004 ★★★★
2005 ★★
2006
2007 ★★★
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018 ★
2019 ★
2020
2021
2022
 
Google
Contents’ pages ★★★★★★★★★★★★★
1997
1998
1999
2000
2001
2002 ★★
2003
2004 ★★
2005
2006
2007 ★
2008
2009
2010 ★
2011 ★★★
2012 ★
2013 ★★
2014 ★★★
2015 ★
2016 ★★
2017 ★
2018 ★★★
2019 ★★★
2020 ★★★★★★★
2021 ★
2022 ★★★★
 
Mojeek
Contents’ pages ★★★★★★
1997
1998
1999
2000
2001
2002
2003
2004 ★
2005
2006
2007
2008
2009 ★
2010 ★★
2011 ★★
2012 ★★★
2013 ★★★★
2014 ★★★
2015 ★★★★★
2016 ★★★★★★★
2017 ★★★★★★
2018 ★★★
2019 ★★★★
2020 ★★★
2021
2022
 
Baidu
Contents’ pages ★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018 ★
2019 ★
2020
2021 ★★★
2022 ★
 
Yandex
Contents’ pages ★★★★★
1997
1998
1999 ★★★★★
2000 ★★★★★★
2001 ★★★
2002 ★★★
2003 ★★★
2004 ★
2005
2006
2007 ★★★★
2008 ★★
2009 ★★
2010 ★★★★
2011 ★★★
2012 ★★
2013 ★
2014 ★★
2015
2016
2017
2018
2019
2020 ★★★
2021 ★
2022
 

To me, that was fascinating. My instincts weren’t wrong with Bing: it’s old and it favours the old (two of the restored articles were indexed). From the first 50 results, 18 results were repeats—that’s 36 per cent. I’m of the mind that Bing is so shot that it can only index old pages that don’t take up much space. New ones have a lot more data to them, generally.

Google does a good job with the top-level and second-level contents’ pages, though there were a few strange tag indices. But the distribution is what you’d expect: people would search for more recent stories. I know we had some popular stories from 2002 that still get hit a lot.

Mojeek has a similar distribution, though it should be noted that you can’t do a blanket site: search. There must be a keyword, and in this case it’s Lucire. The 2016 pages form the mode, which I don’t have a huge problem with; it’s better than the 2001 pages, which Bing has over everything else.

Baidu’s one is crazy as individual stories are seldom spat out in the first five pages, the search engine preferring tag indices, though half a dozen later story pages do make it into its top 50.

Finally, Yandex leans toward older pages, too, including our most popular 2002 piece. It’s the 2000 stories it has the most of among the top 50, and there’s a strange empty period between 2015 and 2019. But at least there is a fairer distribution than Bing can muster.

The other query that I had was whether these search engines were biasing their results toward HTML pages, rather than PHP ones. If that’s the case, then it could explain Bing’s preference for the old stuff (Lucire didn’t have PHP pages till 2008; prior to that it was all laboriously hand-coded, albeit within templates).
 
Bing
★★★★★★★★★★★★★★★★★★★★★★★★★ HTML
★ PHP
 
Google
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★ HTML
★★★★★★★★★ PHP
 
Mojeek
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★ HTML
★★★★★★★★★★★★★★★★★ PHP
 
Baidu
★★★★★★★★★★ HTML
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★ PHP
 
Yandex
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★ HTML
★★★★★★ PHP
 

I think we can safely say there’s a preference for HTML over PHP. Mojeek brings up a lot of HTML pages after the top 50, even though this sample shows the split isn’t as severe.
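
The HTML-versus-PHP split can be sketched the same way. The big assumption here is that the page type can be read from the file extension, with `.html`, `.htm` and `.shtml` counted as HTML and `.php` as PHP; that may not hold for the real site if its PHP pages use extensionless addresses. The function names are hypothetical.

```python
import posixpath
from collections import Counter
from urllib.parse import urlparse

# Assumption: these extensions all count as hand-coded "HTML" pages.
STATIC_EXTENSIONS = {".html", ".htm", ".shtml"}


def page_type(url):
    """Classify a URL as 'HTML', 'PHP' or 'other' by its file extension."""
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        return "HTML"
    if extension == ".php":
        return "PHP"
    return "other"


def split_by_type(urls):
    """Count unique URLs per page type."""
    return Counter(page_type(url) for url in set(urls))
```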

Our PHP pages are less significant though: they contain news stories, and these are often ones other media covered, too. But I would have thought some of the more popular stories would have made the cut, and here it’s Mojeek’s distribution that looks superior to the others’. It seems like it’s actually analysing the page content’s text, which is what you want a search engine to do.

Baidu’s PHP-heaviness is down to all the tag indices—rendering it not particularly helpful as a search engine.

On these two tests, Mojeek and Google rank best, and Yandex comes in third. Baidu and Bing are a distant fourth and fifth.

Posted in China, culture, internet, media, publishing, technology, UK, USA | No Comments »


Google libels, and gets away with it

08.09.2022

This isn’t a dig at Red Points or Hearst this time, since I received an apology and they did what they said: the DMCA claim notices were withdrawn and they have revised their systems. If anything, Hearst SL wound up quite cordial, as their New York office has tended to be in my dealings with them.

This is a dig at Google who only today sent what appears to be the final confirmation that our URLs have been reinstated.

This sorry saga began on August 17 and essentially Google told people searching for various terms that we were thieves till today.

The fact this virtual monopoly can libel someone with impunity—and has done so for years—should disturb any right-thinking person.
 
Speaking of Google, we gave in and connected the revised about.shtml page on the Lucire website to a current page. This was a page we hadn’t linked since the 2000s, but kept coming up high on site:lucire.com searches on Google and, formerly, Bing.

Since I typically don’t use Google for searches, and have not done so regularly for a dozen years, I had no idea until investigating the collapse of Bing’s index recently. (It’s still just as compromised, despite claiming it has a higher number of results for any given search. I see no real evidence of this.)

Admittedly, people might seek an ‘About’ page, so instead of their reading a 2004 page, we took the content from our licensing website and created a new one. The old one is linked from there, as it’s quite a novelty.

Posted in business, internet, publishing, technology, USA | No Comments »


Bing hates novelty—it’s really Microsoft’s Wayback Machine

27.08.2022

Bing is still very clearly near death, as this latest site: search shows.
 

 

It manages a grand total of 10 pages from Lucire, and as outlined before, some are pages that have not been linked to for 17 years.

I purposely updated some of the pages Bing had in its limited capacity, and strangely, those have disappeared! Bing doesn’t want anything new, as it appears to be Microsoft’s Wayback Machine.

The fifth result here is a case in point. Some of you may recall lucire.com/about.shtml appearing in all the search engines, including Bing. This is a page last updated in 2004, with some final tweaks in 2012 (I assume for ad code; I don’t recall). It was a page that I decided I would stick on to a new template, since the search engines loved it so much. I copied the text from our licensing site. And, for the sake of online archæology, I put the 2004 page exactly as it was into a file called about-2004.shtml.

Bing must still be alive enough to spider and index the renamed page, but it rejects the revised about.shtml!

It’s similar to what I wrote in mid-August when I updated other ancient pages from the early 2000s: Bing rejected them, including a frameset that now pointed at the latest page!

You may be thinking: obviously, you are doing something wrong with your newer code, Jack, for Bing to favour the old stuff. But look at the fourth result: it’s from 2020, the one “new” page that Bing has managed to index and show. I don’t think we have anything wrong with our code if this page has made it in.

Google happily included the new about.shtml.

A search for Lucire itself on Bing now does include the home page, which is a new development in a search engine that’s limping along. So much for the earlier claim that there were issues with the page that prevented it from appearing.

Posted in internet, media, publishing, technology, USA | No Comments »


The Red Points saga: this might finally be resolved

24.08.2022

Nine days since the first DMCA notice was lobbed against us, the saga has finally reached the powers-that-be at Hearst SL.

And once it did, things began happening quickly. I’ve heard from their head of legal, and what he’s outlined to me seems like a good resolution to the whole saga.

He tells me some changes have been made to Red Points Solution SL’s processes, which I think is a good outcome if it saves others the grief of what I’ve had to deal with—especially while contending with publishing deadlines and the day-to-day running of a company. It was a bigger distraction than I would have liked to admit.

In a gesture of goodwill, I offered to set to private the two stories we published on the Lucire website over the whole affair.

I suggested to him that I update everyone here, since you might have thought that the disappearance of the two articles was down to Red Points!

I shudder to think what would have happened if I didn’t have contact email addresses for senior VPs at Hearst Communications, Inc. or former Lucire team members who wound up working for Hearst. Or how someone without a legal background specializing in IP would have felt. Not everyone would be in this position.

It’s still concerning to me that Google continues to state that results have been removed in site searches for us, and for the topics those articles covered. Basically, they’re saying we’re thieves, and I don’t think that’s fair dinkum. As Google works at a glacial pace, I assume the notices will eventually disappear once they receive Red Points’ withdrawals.

I’ve also received an apology from Red Points’ CMO. The gentlemanly thing to do is to accept it. It will be interesting to see how long it takes for Google to stop saying we stole stuff.

Posted in business, internet, media, New Zealand, publishing, technology, USA | No Comments »


Testing the seven search engines in the world

22.08.2022

After reading Mojeek’s blog post from last July, I learned there are only seven search engines in the world now. In other words, I was checking more search engines out in the 1990s. It’s rather depressing, especially as the search market is largely a monopoly with Google dominating it (and all the ills that brings), and Bing and its licensees (like Duck Duck Go) with their 6 per cent.

Knowing there are seven, I fed the site:lucire.com search into all of them to see where each stood.

The first figure is the claimed number of results, the second the actual number shown (without repeats removed, which Bing is guilty of).

I can’t use Brave here as its site search is Bing as well.

Yandex appears to be capped at 250 and Mojeek at 1,000, but at least they aren’t arbitrary like Google and Baidu. Baidu has a lot of category and tag pages from the WordPress section of our site to bump up the numbers.
 
Gigablast 0/0
Sogou 19/13
Bing 243/50
Baidu 13,700/213
Yandex 2,000/250
Google 6,280/315
Mojeek 3,654/1,000
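
To put the claimed and shown figures side by side, the share of claimed results that each engine actually displays can be worked out from the list above. This is a minimal sketch using only those numbers; the function name is hypothetical, and the shown counts still include any repeats.

```python
# Claimed and actually shown result counts, copied from the list above.
RESULTS = {
    "Gigablast": (0, 0),
    "Sogou": (19, 13),
    "Bing": (243, 50),
    "Baidu": (13_700, 213),
    "Yandex": (2_000, 250),
    "Google": (6_280, 315),
    "Mojeek": (3_654, 1_000),
}


def shown_share(claimed, shown):
    """Return shown/claimed, or None when nothing was claimed."""
    return shown / claimed if claimed else None


for engine, (claimed, shown) in RESULTS.items():
    share = shown_share(claimed, shown)
    label = "n/a" if share is None else f"{share:.1%}"
    print(f"{engine:<10}{claimed:>7,} claimed {shown:>6,} shown {label:>7}")
```

Yandex’s 250 out of 2,000, for instance, works out to 12.5 per cent.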
 

Frankly, more of us should go to Mojeek. It can only get better with a wider user base. Unlike Bing, it hasn’t collapsed. I know most of you will keep going to Google, but I just don’t like the look of those limits (not to mention the massive privacy issues).

Mojeek is now at 5,900 million pages, which must be the largest index in the west outside of Google.

Posted in China, internet, publishing, technology, UK, USA | No Comments »