Torgo χ (torgo_x) wrote in suggestions,

Smarter RSS rejection

Short, concise description of the idea
When initializing a new syndication, make sure that the source is actually RSS/RDF (not GIF/HTML/etc)

Full description of the idea
I've seen a lot of bad feeds created by users who don't know that the syndication-source URL has to point to RSS/RDF, and so just feed in whatever URL they want to watch, whether it points to an HTML page, a GIF, or something else.

My suggestion is to make the RSS poller here reject and delete any feed whose initial fetch either fails or returns something that doesn't start with a "<" character. It's just two or three lines of code, but it'll stop the current nastiness of the RSS harvester constantly trying to harvest non-RSS content (and using your bandwidth, etc.).

An ordered list of benefits

  • Saves LJ bandwidth
  • Recovers gracefully from erroneous feed creation.

An ordered list of problems/issues involved

  • Requires poking around in the Perl.
  • All well-formed XML files start with a "<", correct?

An organized list, or a few short paragraphs detailing suggestions for implementation

  • undo_feed_creation_dangit() unless $resp->is_success and $resp->content =~ m/^\s*</;
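
To spell the check out a bit more, here is a rough sketch in plain Perl. The helper name looks_like_feed() and the sample inputs are made up for illustration; none of this is taken from the actual LJ code. It assumes the caller already has the success flag and body of the first fetch (for example from $resp->is_success and $resp->content), and it tolerates a byte-order mark or leading whitespace before the "<".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the LJ codebase): returns 1 when the first
# fetch succeeded and the body could plausibly be XML, 0 otherwise.
sub looks_like_feed {
    my ($is_success, $content) = @_;

    # A failed fetch or an empty body can never be a feed.
    return 0 unless $is_success;
    return 0 unless defined $content && length $content;

    # Skip an optional byte-order mark (raw UTF-8 bytes or decoded U+FEFF)
    # and any leading whitespace, then require a "<".
    return $content =~ /\A(?:\xEF\xBB\xBF|\x{FEFF})?\s*</ ? 1 : 0;
}

# Made-up sample inputs: [ fetch succeeded?, body ]
my @cases = (
    [ 1, qq{<?xml version="1.0"?>\n<rss version="2.0"></rss>} ],
    [ 1, "GIF89a" ],
    [ 0, "<rss></rss>" ],
);

for my $case (@cases) {
    my ($ok, $body) = @$case;
    print looks_like_feed($ok, $body) ? "keep feed\n" : "reject feed\n";
}
```

One caveat: HTML pages also start with a "<", so this only weeds out things like images and failed fetches. Catching HTML as well would probably mean looking at the root element too, which is more than two or three lines.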
Tags: syndication, § implemented
