Refinements for robots.txt
Short, concise description of the idea
I would like the ability to customize robots.txt so that my choices are not all-or-nothing. I'd like to be able to allow Google to index my journal but not other bots.
Full description of the idea
In general, robots.txt can be customized: it can allow or disallow specific bots, allow indexing of text but not images, put certain directories off-limits, and so on. LJ's feature "minimize inclusion of results in search engines" is all-or-nothing: if I want Google to index my posts, I have to accept indexing by every other bot as well. Given current attempts to target LJ users (media fans in particular) and harvest their information for commercial purposes (see http://nakeisha.livejournal.com/478324.html), I can't keep the convenience and visibility of Google indexing without also accepting much more targeted and intrusive indexing.
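To illustrate the kind of rule set being asked for, here is a minimal sketch of a robots.txt that admits Googlebot and asks every other crawler to stay out, checked with Python's standard urllib.robotparser. The rules, the is_allowed helper, the "SomeOtherBot" name, and the example journal URL are all assumptions made for illustration; nothing here reflects how LJ actually generates robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical per-journal rules (an assumption, not LJ's actual output):
# Googlebot may crawl everything, every other crawler is asked to stay out.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if these rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder journal URL, used only for the demonstration.
    url = "http://example.livejournal.com/12345.html"
    for bot in ("Googlebot", "SomeOtherBot"):
        print(bot, is_allowed(bot, url))
```

An empty "Disallow:" line means nothing is off-limits for that bot, while "Disallow: /" covers the whole journal. Note that robots.txt is only a request: well-behaved crawlers honor it, but it cannot force a harvester to comply.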
- Greater control of privacy.
- Greater customizability for users who use LJ as a primary site.
- I have no idea how hard this would be to program. I presume it wouldn't be used by very many people because it is an advanced customization feature, but if targeted data harvesting becomes more common, the feature might increase in popularity.