Tumblr tagging and why it’s a problem on non-Tumblr sites
Seriously, though, have fun with the tag commentary if you like. But please try not to do that outside Tumblr. It doesn’t matter on this site, because its tags policy is crap anyway, but on other sites tag metadata is important to data organization and preservation, and users’ ability to navigate and hold conversations.
It costs sites like AO3 a lot of time, effort and money to clean up from Tumblr-style tagging.
SIGNAL BOOST.
This librarian says please yes please to keeping Tumblr tags on Tumblr. Don’t kill the metadata signals with noise, mmmkay?
So, this topic keeps crossing my dash and every time I thoroughly sympathise with the beleaguered tag wranglers and the logic and wanting tags to be categories and all that.
But. In my experience, telling people how to use interactive platforms and trying to stop these sorts of cultural evolution things has a very small chance of success. I mean, it’s not like that stuff makes SENSE on Tumblr either. I found it massively confusing at first.
But I’m afraid that it has become a subcultural quirk we have to live with and engineer around, however painful that may be.
I wonder if there is a workaround, such as connecting all the “commentary tags” to a meta tag?
The most sensible thing would probably be to throw out the box in which we free-type tags and offer a list of existing tags instead, to which new tags can be added after someone looks at them or something?
Nope, Frytha. Sorry but no. That’s exactly the point. All that tag-connecting you’re talking about? Has to be done by hand. Computers can’t process meaning on that level. That IS what they’re doing now and that’s exactly what’s taking so much time and energy.
Bottom line: screwing with the tags that way is flat-out abusing and vandalizing the system, for most sites.
And we have one way to engineer around it: we can make the tags go away. A lot of sites—and not just AO3, that’s just my go-to example—will become a lot less fun if they have to rip out their user-defined metadata setups and install controlled metadata vocabularies.
I’m not deluding myself that the majority of crazy Tumblr-taggers are going to come across these posts and go, “Oh, okay!” But seriously, it’s not okay. When you take that particular behavior out beyond Tumblr, YOU ARE CREATING A PROBLEM. You’re doing DAMAGE. What’s fun and frivolous for you is costing other people time and money—often more time and money than they can afford.
What those commentary tags are for many sites is very expensive graffiti. I know that internet users seldom give much thought to the resources it takes to maintain their networks and favorite sites; it’s a pretty complex subject, and most users have other things to do with their time than educate themselves on the underpinnings of their favorite sites. But what they need to understand is that those resources aren’t infinite, and they are not cheap.
Commentary tags require categorization by actual people, they require space in the databases (which bloat ridiculously with thousands upon thousands of one-off tags), and they require network and bandwidth resources every time a story with one of those tags is called up to read or when those tags are clicked on or otherwise searched.
Site owners accordingly create policies that are meant to keep the place enjoyable and affordable for all the people who use it. When you blow off those policies (such as ‘Please tag this way, look we’ve even designed tutorials if you’re confused’) you’re crapping all over everybody else’s experience.
So usually I’m a big fan of ‘Do what you like as long as it isn’t hurting anybody,’ but THIS IS HURTING PEOPLE. AO3 is my go-to example, but it’s far from the only one. Libraries, wikis, Flickr, other social media sites… User-defined metadata is an expensive choice that these sites have made for YOUR convenience. Don’t abuse it.
Also, superhappygenki, who is an AO3 tag wrangler, has some how-to information and a tutorial, if you want one.
I didn’t reblog this when it was going around but I’m feeling inspired to by the fic I saw tagged “Isaac likes grilled cheese” on AO3.
I am rolling my eyes at people saying that tagging your stuff on social media websites whichever way you want is ~abusive~, no it’s not - you make the choice of how you want to tag your stuff based on how you want people to find your stuff, if you want them to be really hard to find, well that’s your choice? actual databases like libraries aren’t tagged by random users, they’re tagged by librarians and volunteers who work with clearly defined and strict tagging systems like the library of congress subject headings - something which is clearly more expensive than letting random users tag thus something websites like AO3, flickr etc for all their dedication to accurate tagging would never do - tagging on social media websites isn’t meant to be rigorous, it’s meant to be fun and a way for people to find your stuff
ETA: okay maybe I am saying this because in terms of tagging and accessibility the biggest problem with online texts and online texts databases is that they’re in outdated html so they’re very hard to transfer to other platforms / work with (e.g. convert into an ebook) - this is especially a problem with poetry and plays / texts in slightly unusual formats and I have been tagging Aurora Leigh in xml for two weeks and it’s really boring
I’m not sure how I feel about calling AO3 a social media website per se, but I’ll leave that aside.
the above posts definitely aren’t perfect, but the issue with AO3 specifically is that they’ve already had to disable tag searches once (for several months) when their membership first exploded in 2012 because there were too many new tags that they had to sort through. not having access to tags as a sorting mechanism was really annoying and made the site much more difficult to interact with (and they still haven’t been able to restore all of the old functionality, because with people now tagging their fics things like “Isaac likes grilled cheese” showing every tag when you’re filtering by tag is no longer feasible), and I don’t want them to have to do it again.
and AO3 is volunteer-run, so if this kind of maintenance gets too time-consuming they might have to.
there are probably other ways they could address this problem, and maybe they are, but in the meantime people posting their fics on AO3 can make their lives less difficult and help maintain the utility of the service.
I think it’s important to understand what tags do to a database if they are messy. Tags form what’s called a many-to-many relationship, because each post is associated with many tags (potentially), and each tag with many posts (potentially). These kinds of relationships can lead to extremely large databases, which in turn makes actually using those databases extremely slow and painful. When you have lots and lots of “junk” tags, they’re bloating the database, because as far as the software knows they’re just ordinary tags, so it has to treat them the same way it treats useful tags. As the number of tags grows, the complexity of the database grows very quickly. A site like Tumblr may have the money to mitigate the problems of database complexity (faster hardware, bigger and smarter caches, etc.) but not everyone can. In fact, even Tumblr probably has issues, which I would bet is the reason for a lot of the strange things about Tumblr tagging (i.e. only the first 5 being searchable).
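For anyone who wants to see what a many-to-many relationship looks like in practice, here is a minimal sketch using Python's built-in sqlite3 module. The table names (works, tags, taggings) and the helper functions are made up for illustration and are not any real site's schema. It only shows the structure: each tag is stored once, and a join table links tags to works, so a one-off commentary tag is stored exactly like any other tag.

```python
import sqlite3

# Hypothetical schema for illustration only; not any real site's database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE works (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tags  (id INTEGER PRIMARY KEY, name  TEXT NOT NULL UNIQUE);
    -- The join table is what makes the relationship many-to-many.
    CREATE TABLE taggings (
        work_id INTEGER NOT NULL REFERENCES works(id),
        tag_id  INTEGER NOT NULL REFERENCES tags(id),
        PRIMARY KEY (work_id, tag_id)
    );
""")


def add_work(title, tag_names):
    """Insert a work and link it to each tag, creating tags as needed."""
    work_id = conn.execute(
        "INSERT INTO works (title) VALUES (?)", (title,)
    ).lastrowid
    for name in tag_names:
        # A tag row is created only the first time a name is seen.
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO taggings (work_id, tag_id) VALUES (?, ?)",
            (work_id, tag_id),
        )
    return work_id


def works_tagged(name):
    """Return the titles of all works carrying the given tag."""
    rows = conn.execute(
        """SELECT w.title FROM works AS w
           JOIN taggings AS t ON t.work_id = w.id
           JOIN tags AS g ON g.id = t.tag_id
           WHERE g.name = ?
           ORDER BY w.title""",
        (name,),
    )
    return [row[0] for row in rows]


add_work("Story A", ["Angst", "Romance"])
add_work("Story B", ["Angst", "Isaac likes grilled cheese"])

tag_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
tagging_count = conn.execute("SELECT COUNT(*) FROM taggings").fetchone()[0]
print(works_tagged("Angst"), tag_count, tagging_count)
```

In this toy version a shared tag like Angst costs one row in tags plus one row in taggings per work, while a one-off tag costs one row in each. How much that matters at scale depends on the rest of the system, which the sketch does not try to model.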
Speaking purely for the AO3, some points of friendly clarification. :-)
The kind of one-off commentary tags that are frequently referred to as Tumblr-style tags do not put any kind of extra strain on the database, or require more work from the wranglers than any other Additional Tag, such as Romance or Angst or Pretzels. Even the fact that there are a lot of them isn’t really an issue. The biggest contributors to our database size are the works themselves (or chapter content, technically), the reading history we keep for every user, and all the kudos people have left since December 2010 (20 million and counting).
Purely from a technical standpoint, even thousands of one-off tags are a drop in the ocean, given all the other stuff. If you use a lot of them to describe your fanwork, they might make the information in your work blurb harder to parse and/or annoy fellow users, but that’s a different story and a different problem for the coders to handle.
In general, code complications and performance issues arise from the more or less objectively “useful” tags, such as Fandoms and Characters as well as canonical Additional Tags, especially if they’re heavily interlinked (cf. Mythical Beings & Creatures). Keeping track of metatags, child tags, synonymous tags (all the tags!) when filtering works, or nominating fandoms for a challenge, or otherwise interacting with the tagging system in any way - that’s what’s technically hard.
But Tumblr-style tags that don’t seem useful (outside the particular work they were used on) can just be chucked into the metaphorical void that holds the non-canonical or unwrangleable tags. Or they can be neatly grouped together under a helpful common keyword, such as Feels. (See also: the podfic version.) It doesn’t create more work for the wranglers than the simple act of wrangling already does. And trust us: the wranglers really, really like organizing your tags in the background.
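For the nerds, a rough sketch of why the interlinked tags are the hard part while the one-offs are not. This is not the AO3 code, just a toy model in Python; the SYNONYMS and SUBTAGS tables, the function names, and the example subtags are all assumptions invented for illustration. Filtering by a wrangled tag means resolving synonyms and walking every subtag, while an unwrangled one-off tag has no links to follow at all.

```python
# Toy data, invented for illustration; not real wrangling data.
SYNONYMS = {"Feelings": "Feels", "so many feels": "Feels"}
SUBTAGS = {
    "Mythical Beings & Creatures": {"Dragons", "Vampires"},
    "Vampires": {"Vampire Hunters"},
}


def canonical(tag):
    """Map a synonym to its canonical tag; unwrangled tags map to themselves."""
    return SYNONYMS.get(tag, tag)


def expand(tag):
    """Return the canonical tag plus all of its subtags, followed transitively."""
    seen = set()
    stack = [canonical(tag)]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the subtag links
        seen.add(current)
        stack.extend(SUBTAGS.get(current, ()))
    return seen


def filter_works(works, tag):
    """Return sorted titles of works whose tags fall under the given tag."""
    wanted = expand(tag)
    return sorted(
        title
        for title, tags in works.items()
        if any(canonical(t) in wanted for t in tags)
    )


WORKS = {
    "Story A": ["Dragons", "Angst"],
    "Story B": ["so many feels", "Isaac likes grilled cheese"],
    "Story C": ["Vampire Hunters"],
}

print(filter_works(WORKS, "Mythical Beings & Creatures"))
print(filter_works(WORKS, "Feelings"))
print(filter_works(WORKS, "Isaac likes grilled cheese"))
```

The metatag lookup has to touch every linked subtag before it can match anything, whereas the grilled cheese tag expands to just itself. A real system would do this against a search index rather than in-memory dictionaries.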
And this is just a minor point, but for completeness’ sake: tag filtering had to be disabled for four months in 2012 (*memory cringe*) because the old code that created the filtering options just couldn’t handle the dramatic increase in site usage - and our servers were melting. (This AO3 news post has a few more details on that. Also, we’ve upgraded our servers since then.) There was no point in trying to apply some spot fixes here and there; it had to be completely ripped out and rewritten from scratch. And not only for the filtering sidebar, but for other site areas that relied on the new underlying search engine as well (elasticsearch, for the nerds among us). Then it had to be tested and tweaked a lot before we could release it into the wild.
The whole process turned out to… require a little more time than we’d hoped, but it had nothing to do with any kind of tag in particular. There was just a lot of everything and then suddenly a lot more of it.
There are more resources to be found in this post by nannaluna further down or up this reblog chain, and if you have any more questions, you can always contact Support, who are the nicest. :-)

