After years of using the web, I think I know a little about how web spidering works. The web spider hits your home page (provided it knows about it), then proceeds to follow the links on it. Precedence is given to the pages within your site that are linked most, or are top-level: in Lucire’s case, that would be the home page, and the HTML pages that come off it (the indices for fashion, beauty, travel, and lifestyle, among others). Weighting would be given to those linked more: with so many fashion stories on the site linking back to the fashion index, the one we’ve used since the mid-2000s, the fashion index should rank highly. This, I thought, was conventional.
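To make that mental model concrete, here is a minimal sketch in Python. It is only my understanding of conventional spidering, not how Google or Bing actually rank anything. The `SITE` map, its page paths, and the `crawl` and `rank` functions are all made up for illustration.

```python
from collections import Counter, deque

# Hypothetical toy site: each page maps to the internal pages it links to.
SITE = {
    "/": ["/fashion/", "/beauty/", "/travel/"],
    "/fashion/": ["/", "/fashion/story-a.html", "/fashion/story-b.html"],
    "/beauty/": ["/"],
    "/travel/": ["/"],
    "/fashion/story-a.html": ["/", "/fashion/"],
    "/fashion/story-b.html": ["/", "/fashion/"],
    "/2002/": ["/"],  # nothing links here, so a crawl from "/" never finds it
}


def crawl(site, start="/"):
    """Breadth-first crawl from start; return (depth per page, inbound-link counts)."""
    depth = {start: 0}
    inbound = Counter()
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in site.get(page, []):
            inbound[target] += 1          # every link found counts towards the target
            if target not in depth:       # first time this page has been seen
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth, inbound


def rank(site, start="/"):
    """Order crawled pages: most inbound links first, then shallowest, then by URL."""
    depth, inbound = crawl(site, start)
    return sorted(depth, key=lambda p: (-inbound[p], depth[p], p))


if __name__ == "__main__":
    for page in rank(SITE):
        print(page)
```

Under these assumptions the home page and the fashion index come out on top, and the unlinked `/2002/` directory never appears at all, which is what I would expect from a real crawl.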
With Bing becoming Microsoft’s Wayback Machine and generally failing to pick up anything after 2009, there must be something else going on in search engine-land. After reporting on Google’s failures in January, I see little has improved. I was even able to do a Google search where 10 per cent of the top 50 results were repeated—which beats Bing’s 40 per cent—though I wasn’t able to replicate that for this post. But the issue is that this shouldn’t be happening at all.
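For anyone wanting to check their own results, this is roughly how a repeat percentage can be worked out. It is a sketch that assumes a result counts as repeated when its URL has already appeared higher up the list. The `repeated_share` function and the example.com URLs are invented for the example.

```python
def repeated_share(urls):
    """Return the fraction of results whose URL already appeared earlier in the list."""
    seen = set()
    repeats = 0
    for url in urls:
        if url in seen:
            repeats += 1
        else:
            seen.add(url)
    return repeats / len(urls) if urls else 0.0


# Hypothetical top 50: 45 distinct pages followed by 5 repeats of earlier ones.
results = [f"https://example.com/page{i}" for i in range(45)]
results += [f"https://example.com/page{i}" for i in range(5)]
share = repeated_share(results)
```

With five repeats in 50 results, `share` works out to 0.1, or 10 per cent.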
Here were Google’s top 10 yesterday for site:lucire.com, with my remarks next to the entries. As with Bing, there’s a page in there that’s never been referred to; if it ever were linked, it would have been accidental (it’s the subdirectory for 2002 articles; we used to put an index.html redirect in those directories in case the pages were accidentally hit due to manual coding). The number of times lucire.com/2002 would have been referenced would be fewer than ten, maybe even fewer than five. But there it is, in third.
There are three framesets from 20 years ago that have made it into the top 10. There is Devin Colvin’s entertainment page from 2004–5 that has not been linked to in 17-plus years, except by Bing, and now Google has decided to make it top-10 prominent as well.
I’ve no feelings either way about a 2011 and a 2022 article appearing in the top 10, though it’s very, very strange that the top-level pages (pages that are linked throughout the site from articles dating from 2005 and later) don’t appear. They used to in Google.
Google cannot hide behind the excuse that its service has worsened because the web’s content has worsened (a phenomenon, I might add, that it created). Here is an existing site, one that has always been in its index (since Lucire pre-dates Google), and Google is doing a terrible job of indexing it and ranking the pages.
Brave, with its few pages, gets it right on a search just for Lucire (we can’t do site: there as that’s powered by Bing). It gives us the print ordering page, the beauty index, the news page, and the travel index (‘Volante’). Mojeek requires a search word so obviously that sways things, but even then it manages to come up with the current home page, ‘Volante’ index, and Lucire TV, which are acceptable. At least they’re current, and currently linked. Today, Bing has fallen to five results for site:lucire.com, its lowest ever, and four of those pages are framesets from 2002.
In fact, it might be time to see how it’s gone for our sample set.
Not great for us. There are some anomalies there, chiefly Google’s estimates of what it has for The Rake, and it seems Lucire’s on its way out of Bing altogether. Mojeek continues to be the most steady, stable and sensible of these three search engines.
If you’re relying on Google or Bing, you really need to think twice. Something has been wrong with Bing for some time, and it’s catching in Mountain View.