[NOTE: This essay was originally published in March of 2021, so some of the facts cited have changed since. Also, in retrospect, my critique of Amazon’s poor metadata application and lack of effective search functions was naïve, and not critical enough, in my opinion, after learning that the apparent inefficiencies are by design and commercially conducive to the platform. Stay tuned for Part Two.]
Let’s assume that I’m using “horsewhip” metaphorically and mainly to underscore (ouch!) the tenor of my argument.
A lot gets said about Jeff Bezos, good and bad. In my former life as an electronic publishing-focused trade journalist and later as an analyst of digital publishing technologies and business processes, I got to hear Mr. Bezos at many conferences starting back in the mid-1990s and on, and for years the word on Amazon was that the company made no money selling books online. Obviously, this market building has paid off over the decades to the tune of Jeff Bezos’ collective wealth poking around the $200 billion mark, and that is even after his very expensive divorce.
The company itself has plenty of money, too, including cash on hand. According to Macrotrends, which self-describes as “The Premier Research Platform for Long Term Investors,” Amazon cash on hand for the quarter ending September 30, 2020 was $68.402B, a 57.6% increase year-over-year.
I’m putting aside the entirely obvious problem with any one person and any one company having so much money and the equally obvious need to address this state of affairs through better progressive income tax and corporate taxes; there’s plenty of talk about this societal absurdity elsewhere. What there isn’t much talk about is the lousy job Amazon does in its core business. Considering how much money the company makes, it seems absurd to criticize Amazon on this point, but hear me out.
First, let me spell out that Amazon has a number of core businesses, including Amazon Web Services (AWS), reportedly the biggest giant server farm service (a.k.a. cloud platform) in our known galaxy, although Google Cloud is very big, as are Microsoft Azure and similar efforts by Apple and Facebook: they all reside on Mt. Olympus. I’m not going to address cloud services other than to mention that in 2020 the money taken in across the board by this sector was over $50 billion, and that AWS has about a third of the market, the point being that lots of money comes in from that part of the company’s effort.
Plenty of money comes in from other parts of Amazon’s businesses, including, no surprise, its catalog service, which is what most people think of when they think of Amazon and what most people use Amazon for, whether to buy books online or a pair of chinos or a television show. Starting as an online book seller in 1994, this part of Amazon is now the digital world’s largest and best-known Sears and Roebuck Catalog.
I don’t want Jeff Bezos horsewhipped because he owns and runs the digital world’s largest and best-known Sears and Roebuck Catalog. I want him horsewhipped because he is doing such a piss-poor job of it.
I was involved in digital publishing for over three decades, coming out of book publishing on the editorial side, so I’ve taken special interest in Amazon and was there for all three generations (third time was the charm) of the conferences and promises about e-books and the chronic hand-wringing of print publishers about digital forms of books. I’m not interested in re-litigating the old print-versus-digital arguments, although my complaint about Amazon’s failures does include negative consequences for books. What I want to talk about is the poor, and worsening, job Amazon does in its core business, which I’ll describe as the business of helping people find things they buy.
People looking for something they want to buy, even if they intend to shop locally, often start with Amazon, although some people start with Google or another general search engine, and it is not germane to this discussion which approach is most prevalent. What is germane is that finding things through Amazon, whether light bulbs or books, has gotten harder and less effective. The search function of Amazon has been decaying over the years.
One of those shining bright promises of the digital world is constant improvement, whether in better operating systems, or faster or smaller or cheaper computers and smart phones, or a myriad of other technologies or services. Amazon has failed to meet this common expectation even while earning and holding on to tens of billions of dollars a year, which in my perhaps simplistic view of economics suggests to me that Amazon could in fact afford to do a better job. Amazon’s search interface has gotten harder to use and Amazon’s application of cataloging metadata has gotten sloppier and sloppier.
What’s with “metadata,” right? Metadata is data about data, and in the case of catalogs, the data would be the name of the object being cataloged and the metadata is information about that object, including things like part and serial number, category of object (e.g., “light bulb”), sub-category (e.g., LED), source information (e.g., brand and manufacturer of the object), and other information about color, shape, size, and so forth and so on, so that the thing can be searched on and otherwise managed for such actions as pricing, inventory, shipping, and billing. Humans have been cataloging things for at least 200 years, and that’s if you only consider commercial applications. When you look in the fields of science and philosophy, cataloging has been going on for many centuries, and if you don’t believe me, maybe you should ask some other Homo sapiens Linnaeus who is professionally involved with taxonomies.
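To make the idea concrete, here is a minimal sketch in Python of what a catalog record and a metadata-driven filter might look like. The field names and the three sample items are invented for illustration; this is not Amazon’s actual schema, just the general shape of the thing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogItem:
    name: str               # the "data": what the object is called
    category: str           # metadata, e.g. "light bulb"
    sub_category: str       # metadata, e.g. "LED"
    brand: str              # metadata: source information
    base: str               # hypothetical field for the socket/base type
    dimmable: bool
    color_temp_kelvin: int


def matches(item: CatalogItem, **wanted) -> bool:
    """Return True when every requested metadata field equals the wanted value."""
    return all(getattr(item, key) == value for key, value in wanted.items())


# Made-up sample catalog, for illustration only.
catalog = [
    CatalogItem("Bulb A", "light bulb", "LED", "Acme", "two-pin", False, 2700),
    CatalogItem("Bulb B", "light bulb", "halogen", "Acme", "two-pin", True, 2900),
    CatalogItem("Rope light C", "light strip", "LED", "Bolt", "plug-in", True, 3000),
]

# Search on metadata rather than on the name alone.
hits = [item.name for item in catalog if matches(item, sub_category="LED", dimmable=False)]
print(hits)
```

If the metadata is filled in consistently, a search for a non-dimmable LED bulb has no reason to return halogen bulbs or rope lights.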
Let me tell you about a recent experience I suffered through when buying a light bulb online through Amazon.
I knew what I wanted, which was an LED Para 16 (a specific bulb base with two stub-style pins, perhaps most common in track lighting), non-dimmable (LED bulbs are typically dimmable or non-dimmable, and if you use a non-dimmable version in a lighting fixture with a dimmer, then welcome to constant flickering that both reduces the lifetime of the bulb and drives you crazy), and 50-watt equivalent warm light (which is Kelvin-rated between 2400 and 3500). I didn’t care about brand, but I did have search results ordered as “lowest-to-highest price,” being the cheapskate that I am. I did know what I was looking for and I entered the search terms accordingly.
I got over 40 pages of results.
Those 40-plus pages of search results weren’t well focused on what I had searched on, with many of the results reporting back halogen 50-watt bulbs and/or different bulb bases and/or dimmable and non-dimmable bulbs alike, and other radically different forms of LED lights such as those “rope” lights that sure as hell weren’t going to fit into the bulb sockets I had. So, like any good member of the digerati, I looked to see what further search criteria were available to me to tighten the search. On Amazon these are found in a left-hand column, but in this case the column was largely about brand selections. And have I mentioned that the first rows of each search result page presented “Sponsored” product links that often had little to do with what I was looking for, and certainly didn’t present the lowest-priced results up top?
Basically, Amazon isn’t doing its metadata homework, and obviously, it is not for lack of resources. Amazon’s basic search fails at basic functionality, and its advanced search option, which I’d guess most people (like 98% or so) have never looked at, and good on them, sucks too. There are many decades of search and retrieval techniques and developments to draw on, including Boolean search, which lets a searcher add, define, exclude, and order the sequences of search sets, and which is named for George Boole, the man who invented Boolean logic in the 19th century.
I suspect that Amazon isn’t doing its metadata homework because it doesn’t have to: returning poor search results probably makes Amazon money, such as through the income from ads, which in the case of Amazon are those “Sponsored” products. I suspect that Amazon doesn’t care if you don’t find exactly what you are looking for because they know you are still likely to buy something close to what you want. No doubt many users assume they’ve gotten as close as they can to what they want because, of course, why wouldn’t the exact item come up from their search, so it is probably not available, and what the hell, let’s settle. Amazon is actually even more nefarious in that they use search results to determine market interest, and then Amazon follows up on that interest by competing directly with their supply-side customers (the companies making or managing items for sale on the site) through the Amazon Marketplace, Amazon Basics, and Amazon Essentials programs, where they contract production of high-interest items and then sell their own brands directly, typically at lower cost and, in my experience, lower quality. This is part of the antitrust and anti-competitive arguments being leveled against Amazon even as we speak, but maybe I’ll cover that in Part 2 or something.
One might argue that Amazon, with its always expanding catalog, is always playing catch-up with metadata, since there are new items coming in the door all the time. On the other hand, if their experience with books is useful prologue, then Amazon has already put the onus of metadata on the supply side. Amazon, in the case of books, has a standard for the required metadata for content title submissions into the Amazon system in the form of the Amazon version of ONIX, defined by Book Industry Study Group as follows:
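For anyone who hasn’t run into Boolean search, here is a minimal sketch in Python of how AND, OR, and NOT combine sets of results. The tiny index below, with its search terms and item numbers, is entirely made up for illustration; a real search engine would be far more elaborate.

```python
# Hypothetical inverted index: search term -> set of matching item IDs.
index = {
    "led": {1, 2, 3, 5},
    "halogen": {4},
    "dimmable": {2, 4, 5},
    "rope": {5},
}


def term(word: str) -> set:
    """Look up the set of item IDs for a term (empty set if the term is unknown)."""
    return index.get(word, set())


# led AND NOT dimmable AND NOT rope: set difference does the excluding.
non_dimmable_led = (term("led") - term("dimmable")) - term("rope")

# led OR halogen: set union widens the search.
any_bulb = term("led") | term("halogen")

# led AND dimmable: set intersection narrows it.
dimmable_led = term("led") & term("dimmable")

print(sorted(non_dimmable_led))
print(sorted(any_bulb))
print(sorted(dimmable_led))
```

The exclusion step is the one I was missing in my light bulb search: a way to say "LED, but not dimmable and not rope lights" and have the results respect it.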
ONIX is an acronym for ONline Information EXchange. ONIX for Books refers to a standard format that publishers can use to distribute electronic information about their books to wholesale, e-tail and retail booksellers, other publishers, and anyone else involved in the sale of books.
It has been quite a while since I’ve been active in digital publishing (hey, in my day, I gave at least one conference talk and one or two webinars for Book Industry Study Group), and what I mainly remember is that ONIX had, de facto, a minimum of two versions: one for Amazon, which switched a few things up, and one for everyone else who adhered to the published standard. Helping publishers figure out how to implement the differing versions became something of an industry in its own right and may still be one to this day. My point is that Amazon understands the value of standards for metadata ingestion, even if it is its own version! I couldn’t imagine that they don’t, and a moment’s online research shows me that Amazon indeed has extensive “Category and Metadata” requirements for manufacturers and resellers, which means that Amazon makes the suppliers do it, but how well Amazon uses the information or carries out quality control of the metadata is anyone’s guess.
But I’m familiar with the book side of their business and I have every reason to believe that Amazon continues to enforce the use of ONIX metadata. Yet my experience using Amazon over the last few years is of poorer and poorer book search performance on the part of Amazon’s search engine. It ain’t as good as it used to be and seems to continue to deteriorate.
I periodically spend time looking on Amazon at the state of the science fiction genre and some of its sub-genres. Years back, Amazon had many authoritative lists of science fiction genre titles and sub-genre titles, and by authoritative I mean that a variety of people authored critical essays and lists and reviews of what they thought were the best within a defined science fiction category. These efforts are largely gone, and the few that remain are easy-to-assemble lists such as The New York Times Best Sellers and a few other best-of lists such as Nebula or Hugo award winners. My guess is that Amazon doesn’t want to pay for the time and effort that useful critical lists take, although, to reiterate, at one point years back they had made such an effort. Despite holding well over $60 billion in cash, damn them if they will support substantive quality guides to books.
Unfortunately, these days, there are few other such resources in the non-Amazon world. A few of the big science fiction magazines still exist, but their collective number of book reviews remains small. There are many online science fiction ‘zines, but they quickly come and go, just as the quality is up and down. There have been people generating solid critical work on science fiction over a long time now, but I’d guess the typical online lifespan for any particular useful sort of effort is well under a year and I can’t blame those people who try to do this—reviewing books is time intensive and there is little direct revenue to be had from this kind of effort. What once had been an entire ecosystem in book publishing which asserted quality control—namely editors and reviewers—has for the most part disappeared even among the few “big” publishers of science fiction left, with the gap in title numbers more than filled through self-publishing.
I have nothing against self-publishing, but the lack of quality control is problematic for the reader, and this may be especially true for science fiction. If you look at Amazon Books under the genre of science fiction, and limit search results to books published in the “Last 30 days,” you’ll see that somewhere close to 3,000 science fiction books are published each month and the vast majority of them are self-published, and this very often (far too often) means that you don’t have any editorial gatekeeping contributing to quality control. And things seem to have gotten even worse: I just undertook the search exercise described earlier in this paragraph (Amazon Books, Science Fiction, published in the last 30 days, and with no ratings selected), and the results were “Pages 1-16, over 10,000 items.” Hey, maybe those Para 16 LED bulbs are listed there!
So how do you find a science fiction book you’d like reading? Well, Amazon is not much help. On the face of it, it looks like Amazon is doing a great job—after all, there are 20 sub-genre breakouts (e.g., Adventure, Alien Invasion, Alternate History, Anthologies, and that is just the “A’s”), and there is cover art and the “Look Inside” feature that provides a sample of the content for most of the books. And isn’t there a review system that is crowdsourced, and what could be better than that, right? Click on “Four stars or higher,” right?
Well, if you do exactly this, you’ll still be looking at hundreds if not thousands of books, and that’s if you are only looking for very recently published works. It would seem to be incredibly useful to be able to search for books with a set number of four-star reviews, but your search result will show any and all books with a four-star rating, whether that rating is truly crowdsourced (e.g., say 500 reviews) or is the result of some author or his or her mother supplying the sole review. It isn’t like Amazon doesn’t have the count of reviews for every title. You can see the number of reviews for any given book that a search returns, but you’re still left with one hell of a lot of scrolling, and now you need to scan for the stars, as it were. But why not have this as a search criterion, as in “only four-star or higher reviews with a minimum of X reviews”? In the absence of authoritative guides, Amazon users don’t even have a mechanism to use crowdsourcing as an on-the-fly authoritative guide, except book by book, search result page after search result page.
There is also the problem of scant quality control over the ratings (meta-quality control, if you will), as anyone browsing books on Amazon will attest. A recent search under Science Fiction, “last 30 days” and “four stars or above,” returned about 1,000 science fiction books. When I selected “last 30 days” and “one star or above” I got a search hit for about 2,000 books, which means, best I can figure, that half the books are four stars or better and therefore we must be living in the golden era of science fiction publishing.
Would that it were the case, but alas, I don’t think so. What I do know for certain is that Amazon, despite its monopolistic position that exerts great influence on our culture, has failed to live up to its responsibility of helping its customers find what they want to buy. It sounds crazy, but then again, Amazon is already making megatons of money, so it mustn’t be too badly broken, so don’t fix it, right?
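The filter I’m asking for is not exotic. Here is a minimal sketch in Python, assuming each book carries an average rating and a review count; the titles, numbers, and default thresholds are all made up for illustration and say nothing about how Amazon stores or queries its data.

```python
from typing import NamedTuple


class Book(NamedTuple):
    title: str
    avg_rating: float   # average star rating, 1.0 to 5.0
    review_count: int   # how many reviews produced that average


def well_reviewed(books, min_rating=4.0, min_reviews=50):
    """Keep books rated at least min_rating by at least min_reviews reviewers."""
    return [
        book for book in books
        if book.avg_rating >= min_rating and book.review_count >= min_reviews
    ]


# Made-up sample data.
books = [
    Book("Crowd Favorite", 4.3, 500),
    Book("Mom Liked It", 5.0, 1),
    Book("Widely Panned", 2.1, 800),
]

picks = [book.title for book in well_reviewed(books)]
print(picks)
```

The five-star book with a single review drops out, which is the whole point: the review count is already sitting there next to the rating, and the extra condition is one comparison.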
But really, this is the result of laziness or sloppiness or worse. Clearly there are funds available to improve the accuracy and effectiveness of the search interface, but instead the search function gets worse, not better. And that is why Jeff Bezos should be horsewhipped, or one of the reasons, anyway.
Metaphorically, of course.
I see what you mean.
“1-16 of over 2,000 results for “la law”
I ought to be done browsing through this listing sometime next year.