After the last post, you may be thinking: surely if my site was entirely PHP, Google wouldn’t have a problem identifying which were the most important articles? They are the biggest search engine in the world and all that data would ensure that they knew how to rank the dynamic pages properly. They would have some idea, based on who is clicking on what, which pages should be placed up high in a site: search.
You might have been right in the past, but in 2023, you’d be wrong.
Just as with Lucire and my personal site (both of which are sites with a mixture of static HTML and dynamic PHP pages), Google does a terrible job. With Lucire, PHP drives the news section. The top PHP page Google shows with site:lucire.com comes in second, which is great, and it’s the main news page on the site. But beyond that, not a single PHP-driven news article appears in the top 100 results on Google. There are repeat pages, a bug that Google has introduced as it follows Bing in padding out the results.
There are some PHP pages though, namely pages containing tags. But even those make no sense in terms of ordering. The first is at no. 11, a PHP-generated index of pages tagged with Whangarei. I repeated the search and this result fell to no. 16. Other tags on the second page of results were for 2020 and Bulgari. There is no public tag cloud any more, but I seem to recall the actual top one is Lucire, followed by fashion.
So even on tags, Google gets the ranking very wrong in a site: search, and there is nothing on the site that would lead its spider to think Whangarei needed to be so high. The visits to the site do not bear this out, either.
When you look at a site like Lucire Rouge, which is entirely PHP, it’s an incredible surprise to find that its top pages are dominated by tags, categories and authors’ pages, plus many contents’ pages. These follow the home and contact pages. Again, there are no article pages in the top 100. You can find out for yourself using site:lucirerouge.com. Or better yet, try out a site you know and see if it follows the same pattern.
I recall there was a WordPress SEO plug-in that helped you manage these, ridding you of the tag and other contents’ pages, but Google never needed its hand held so badly before. And if I can’t remember what that plug-in was or how it worked, how would the regular punter? (I have a feeling the plug-in we used became obsolete, or was it inside WordPress as standard? Your guess is as good as mine. I couldn’t find it when I wrote this blog post, but then a lot of websites are no longer intuitive to use.)
With Lucire Men, also entirely PHP, you have to get to page 5 and result no. 43 before you encounter the first article; the first 42 are tag pages or other forms of contents’ pages. Nos. 44, 45 and 46 are also articles. Then it’s back to the indices before page 6 shows all articles.
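If you want to run the same count on a site you know without squinting at every result, here is a minimal sketch in Python. It assumes you have copied the result URLs out of a site: search by hand, in order, and that the site uses WordPress's default `tag`, `category` and `author` URL bases; a site with custom permalinks would need different prefixes. The function names and the example.com URLs are made up for illustration, and anything that isn't a home, tag, category or author page (a contact page, say) will be counted as an article, so treat the answer as a rough guide.

```python
from urllib.parse import urlparse

# Assumption: WordPress's default URL bases for index-style pages.
INDEX_PREFIXES = ("tag", "category", "author")


def is_index_page(url: str) -> bool:
    """Return True if the URL looks like a home, tag, category or author page."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:  # an empty path means the home page
        return True
    return segments[0] in INDEX_PREFIXES


def first_article_position(urls):
    """Return the 1-based position of the first non-index result, or None."""
    for position, url in enumerate(urls, start=1):
        if not is_index_page(url):
            return position
    return None


# Hypothetical, hand-copied result list (example.com is a placeholder).
results = [
    "https://example.com/",
    "https://example.com/tag/whangarei/",
    "https://example.com/author/someone/",
    "https://example.com/2023/01/an-article/",
]
print(first_article_position(results))  # prints 4
```

Paste in your own list in place of `results` and the number that comes back is how far down you have to go before the first article turns up.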
Individual dynamic pages or posts (those generated by WordPress, for instance) now appear to be far too difficult for Google to handle. If your site uses WordPress, expect Google to have difficulty with it now; it certainly explains why this blog’s visits have fallen so badly if Google no longer shows posts up top in a site: search. It means it doesn’t really rate them, so how on earth would I expect them to show up in a search for the topics being covered?
It’s pretty disappointing to see Google search fall so far in such a short space of time.
Of course there are exceptions, and Google seems to do reasonably well on a site:autocade.net search. That site, run on MediaWiki, is all PHP as well, and the results today are as I recall them from ages ago. The ones up top have been pretty popular; certainly they are for models that are among the top pages on the site. And that’s how Google should behave. Goodness knows why it can’t handle WordPress properly any more.
The only oddity here is that Google estimates it has 2,960 results. Mojeek, the search engine whose spider works properly and delivers results as expected, has 3,277. It’s probably the first site I’ve observed where Mojeek has more pages indexed than Google. Is it the beginning of the shift where Mojeek has a larger and more relevant index than Google?
P.S., 9.42 p.m. UTC: Sure enough, it’s not just us. Quartz is pretty famous, and they’re run off WordPress exclusively. Most of their top pages for site:qz.com are tag and author ones, too, though their first article, one which I couldn’t imagine would be their top story, appears as result no. 5. Quartz gets a ton of traffic, but Google can’t do right by them, either.