Posts tagged ‘Lucire Men’

How Google fares in a site: search if your site is all PHP—Wordpress users beware


After the last post, you may be thinking: surely if my site were entirely PHP, Google wouldn’t have a problem identifying which were the most important articles? They are the biggest search engine in the world, and all that data would ensure that they knew how to rank the dynamic pages properly. They would have some idea, based on who is clicking on what, of which pages should be placed up high in a site: search.

You might have been right in the past, but in 2023, you’d be wrong.

Just as with Lucire and my personal site (both of which are sites with a mixture of static HTML and dynamic PHP pages), Google does a terrible job. With Lucire, PHP drives the news section. The top PHP page Google shows comes in second, which is great, and it’s the main news page on the site. But beyond that, not a single PHP-driven news article appears in the top 100 results on Google. There are repeat pages, a bug that Google has introduced as it follows Bing in padding out the results.

There are some PHP pages, though, namely pages containing tags. But even those make no sense in terms of ordering. The first comes in at no. 11: a PHP-generated index of pages tagged with Whangarei. I repeated the search and this result fell to no. 16. Other tags on the second page of results were for 2020 and Bulgari. There is no public tag cloud any more, but I seem to recall the actual top tag is Lucire, followed by fashion.

So even on tags, Google gets the ranking very wrong in a site: search, and there is nothing on the site that would lead its spider to think Whangarei needed to be so high. Visits to the site do not bear this out, either.

When you look at a site like Lucire Rouge, which is entirely PHP, it’s an incredible surprise to find that its top pages are dominated by tags, categories and authors’ pages, plus many contents’ pages. These follow the home and contact pages. Again, there are no article pages in the top 100. You can find out for yourself with a site: search, or, better yet, try out a site you know and see if it follows the same pattern.
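If you want to repeat the experiment, a site: query is nothing more than a parameter on the engine’s search URL. Here is a minimal Python sketch that builds such URLs; the two endpoints are the engines’ public search pages, and example.com is a placeholder domain, not one of ours.

```python
from urllib.parse import urlencode

def site_search_url(engine_base: str, domain: str, term: str = "") -> str:
    """Build a site: search URL for a given engine and domain.

    engine_base: the engine's search endpoint (its /search path).
    domain: the site to restrict results to.
    term: optional extra keywords to add to the query.
    """
    query = f"site:{domain} {term}".strip()
    return engine_base + "?" + urlencode({"q": query})

# Placeholder domain for illustration; substitute a site you know.
print(site_search_url("https://www.google.com/search", "example.com"))
print(site_search_url("https://www.mojeek.com/search", "example.com"))
```

Open the resulting URLs in a browser and compare which pages each engine puts up top; scraping the results programmatically is against most engines’ terms of service, so this sketch stops at building the query.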


I recall there was a Wordpress SEO plug-in that helped you manage these, ridding you of the tag and other contents’ pages, but Google never needed its hand held so badly before. And if I can’t remember what that plug-in was or how it worked, how would the regular punter? (I have a feeling the plug-in we used became obsolete, or was it inside Wordpress as standard? Your guess is as good as mine. I couldn’t find it when I wrote this blog post, but then a lot of websites are no longer intuitive to use.)
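For what it’s worth, plug-ins of that sort generally work by emitting a robots meta tag on the archive pages you want kept out of the index. A sketch of the mechanism, which you could add to a theme’s tag-archive template by hand:

```html
<!-- In the <head> of a tag or category archive template: -->
<!-- tells compliant crawlers not to index this page, but still to follow its links -->
<meta name="robots" content="noindex, follow">
```

Whether that is exactly what our old plug-in did is anyone’s guess, but the tag itself is a long-standing convention that the major crawlers honour.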

With Lucire Men, also entirely PHP, you have to get to page 5 and result no. 43 before you encounter the first article; the first 42 are tag pages or other forms of contents’ pages. Nos. 44, 45 and 46 are also articles. Then it’s back to the indices before page 6 shows all articles.

Individual dynamic pages or posts—those generated by Wordpress, for instance—now appear to be far too difficult for Google to handle. If your site uses Wordpress, expect Google to have difficulty with it now; it certainly explains why this blog’s visits have fallen so badly, if Google no longer shows posts up top in a site: search. It means Google doesn’t really rate them, so how on earth would I expect them to show up in a search for the topics being covered?
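One standard way to nudge a crawler towards individual posts, rather than tag indices, is an XML sitemap listing the article URLs; the sitemaps.org protocol is what Wordpress and its SEO plug-ins generate behind the scenes. A minimal sketch, with a made-up URL and date:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- one <url> entry per article; the location and date here are placeholders -->
  <url>
    <loc>https://example.com/news/2023/some-article/</loc>
    <lastmod>2023-01-01</lastmod>
  </url>
</urlset>
```

Whether Google actually heeds this any better than it ranks the pages is, on the evidence above, an open question.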

It’s pretty disappointing to see Google search fall so quickly in such a short space of time.
Of course there are exceptions, and Google seems to do reasonably well on one site: search I tried. That site, run on Mediawiki, is all PHP as well, and the results today are as I recall them from ages ago. The ones up top have been pretty popular—certainly they are for models that are among the top pages on the site. And that’s how Google should behave. Goodness knows why it can’t handle Wordpress properly any more.

The only oddity here is that Google estimates it has 2,960 results, while Mojeek, the search engine whose spider works properly and delivers results as expected, has 3,277. It’s probably the first site I’ve observed where Mojeek has more pages indexed than Google. Is this the beginning of a shift where Mojeek has a larger and more relevant index than Google?

PS., 9.42 p.m. UTC: Sure enough, it’s not just us. Quartz is pretty famous, and they’re run off Wordpress exclusively. Most of their top pages in a site: search are tag and author ones, too, though their first article, one which I couldn’t imagine would be their top story, appears as result no. 5. Quartz gets a ton of traffic, but Google can’t do right by them, either.

Posted in internet, media, publishing, technology, USA | No Comments »

Why paywalls are getting more prevalent; and The Guardian Weekly rethought


Megan McArdle’s excellent op-ed in The Washington Post, ‘A farewell to free journalism’, has been bookmarked on my phone for months. It’s a very good summary of where things are for digital media, and of how the advent of Google and Facebook, along with the democratization of the internet, has reduced online advertising income to a pittance. There’s native advertising, of course, which Lucire and Lucire Men indulged in for a few years in the 2010s, and I remain a fan of it in terms of what it paid, but McArdle’s piece is a stark reminder of the real world: there ain’t enough of it to keep every newsroom funded.
   I’ll also say that I have been very tempted over the last year or two to start locking away some of Lucire’s 21 years of content behind a paywall, but part of me has a romantic notion (and you can see it in McArdle’s own writing) that information deserves to be free.
   Everyone should get a slice of the pie if they are putting up free content along with slots for Doubleclick ads, for instance. Those advertising networks operate on merit: get enough qualified visitors (and they do know who those visitors are, since very few people opt out; in Facebook’s case opting out actually does nothing and they continue to track your preferences) and they’ll feed the ads through accordingly, whether you own a “real” publication or not.
   It wasn’t that long ago, however, when more premium ad networks worked with premium media, leaving Google’s Adsense to operate among amateurs. It felt like a two-tier ad market. Those days are long gone, since plenty of people were quite happy to pay the cheap rates for the latter.
   It’s why my loyal Desktop readers who took in my typography column every month between 1996 and 2010 do not see me there any more: we columnists were let go when the business model changed.
   All of this can exacerbate an already tricky situation: the worse independent media are funded, the less likely we can afford to offer decent journalism, biasing the playing field in favour of corporate media with deeper pockets. Google, as we have seen, no longer ranks media on merit, either: since they and Facebook control half of all online advertising revenue, and over 60 per cent in the US, it’s not in their interests to send readers to the most meritorious. It’s in their interests to send readers to the media with the deeper pockets and scalable servers that can handle large amounts of traffic with a lot of Google ads, so they make more money.
   It’s yet another reason to look at alternatives to Google if you wish to seek out decent independent media and support non-corporate voices. However, even my favoured search engine, Duck Duck Go, doesn’t have a specific news service, though it’s still a start.
   In our case, if we didn’t have a print edition as well as a web one, then an online-only operation mightn’t be worthwhile without a paywall.

Tonight I was interested to see The Guardian Weekly in magazine format, a switch that happened on October 10.
   It’s a move that I predicted over a decade ago, when I said that magazines should occupy a ‘soft-cover coffee-table book’ niche (which is what the local edition of Lucire aims to do) and traditional newspapers could take the area occupied by the likes of Time and Newsweek.
   With the improvement in printing presses and the price of lightweight gloss paper, it seemed a logical move. Add to that changing reader habits—the same ones that drove the death of the broadsheet format in the UK—and the evolution of editorial and graphic design, and I couldn’t see it heading any other way. Consequently, I think The Guardian will do rather well.

Posted in business, culture, internet, marketing, media, New Zealand, publishing, UK, USA | No Comments »

Putting back allegedly “malicious” code: has Google caught up with reality?


Not a political post, sorry. This one follows up from the Google boycott earlier this month and is further proof of how the house of G gets it very, very wrong when it comes to malware warnings.
   As those who followed this case know, our ad server was hacked on April 6 but both my web development expert, Nigel Dunn, and I fixed everything within hours. However, Google continued to block any website linking to that server, including this blog—which, as it turned out, delayed my mayoral campaign announcement sufficiently for things to go out on the same day as the marriage equality bill’s final reading and Baroness Thatcher’s funeral—and any of our websites carrying advertising. Lucire was blacklisted by Google for six days despite being clean, and some of our smaller websites were even blocked for weeks for people using Chrome and Firefox.
   We insisted nothing was wrong, and services such as Stop Badware gave our sites the all-clear. Even a senior Google forum volunteer, who has experience in the malware side of things, couldn’t understand why the block had continued. There’s just no way of reaching Google people, though, unless you have some inside knowledge.
   We haven’t done any more work on the ad server. We couldn’t. We know it’s clean. But we eventually relented and removed links to it, on the advice of malware expert Dr Anirban Banerjee, because he believed that Google does get it wrong. His advice: remove it, then put it back after a few days.
   The problem is, Google gets it wrong at the expense of small businesses who can’t give it sufficient bad publicity to shatter its illusory ‘Don’t be evil’ claim. It’s like the Blogger blog deletions all over again: unless you’re big enough to fight, Google won’t care.
   Last night, we decided to put back the old code—the one that Google claimed was dodgy—on to the Lucire Men website. It’s not a major website, just one that we set up more or less as an experiment. Since this code is apparently so malicious, according to Google, then it would be logical to expect that by this morning, there would be warnings all over it. Your browser would exclaim, ‘You can’t go to that site—you will be infected!’
   Guess what? Nothing of the sort has happened.
   It’s clean, just as we’ve been saying since April 6.
   And to all those “experts” who claim Google never gets it wrong, that the false positives we netizens report are all down to our own ignorance of computing: well, there’s proof that Google is fallible. Very fallible. And very harmful when it comes to small businesses, which can lose a lot of revenue from false accusations. Even we had advertising contracts cancelled during that period, because people prefer believing Google. One ad network pulled every single ad it had with Lucire’s online edition.
   People are exposed to its logo every day when they do a web search. And those web searches, they feel, are accurate and useful to them, reinforcing the warm fuzzies.
   Can we really expect a company that produces spyware (and ignores red-flagging its own, naturally) to be honest about reporting the existence of malware on other people’s websites? Especially when the code the hackers used on April 6 has Google’s name and links all over it?
   It can be dangerous, as this experience has illustrated, to put so much faith in the house of G. We’ll be steadily reintroducing our ad server code on to our websites. While we’re confident we’re clean, we have to wear kid gloves dealing with Google’s unpredictable manner.

Posted in business, internet, media, New Zealand, publishing, technology, USA | No Comments »