Two suggestions to improve "nohtml=1"
Short, concise description of the idea
When an entry is viewed with HTML turned off using "nohtml=1," I think:
(1) ampersands ('&') preceding "lt," "gt," "amp," or "quot" should be escaped so that escape sequences appear as in the entry's actual "source code"; and
(2) the "Don't Auto-Format" option should be ignored, if it is turned on, and <lj-raw> tags should be ignored.
Full description of the idea
For information on ?nohtml=1, see this entry in my test journal. With this background, I think the short description (above) is quite sufficient.
An ordered list of benefits
For escaping ampersands, the advantage is an accurate representation of the HTML; this is useful when copying the HTML to paste it elsewhere, or when distinguishing "sample" HTML from "real" HTML in entries with a good deal of each. However, ampersands should be escaped only in these four cases: if an ampersand is not part of an entity reference (say, it appears in the middle of a URL), or if it is part of a different entity reference, then escaping it is unnecessary and potentially destructive.
For keeping auto-formatting on regardless of "Don't Auto-Format" or <lj-raw>, the advantage is that if the code of the entry is "clean" and neatly spaced, then it stays that way when the entry's HTML is displayed, too. Also, with hyperlinks, this would allow auto-linkification of the URL that the hyperlink points to, so the link can still be used.
An ordered list of problems/issues involved
If I make a hyperlink to a page with an ampersand '&' in the URL, and I (correctly) escape the ampersand myself, then this would double-escape the ampersand, ruining the hyperlink. (I think.) To solve this, the escaping might be done only for &lt; and &gt;, but that slightly offsets one of the advantages above. Without making the coding unnecessarily complicated, I see no perfect balance.
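To make the double-escaping concern concrete, here is a small Python sketch (Python is used only for illustration; this is not LiveJournal's code). The two patterns, the helper `escape_with`, and the example URL are all hypothetical, and the sketch assumes an entity reference always ends with a semicolon.

```python
import re

# Hypothetical patterns: match '&' only when it starts one of the listed
# entity references (assumed here to always end with ';').
ALL_FOUR = re.compile(r"&(?=(?:lt|gt|amp|quot);)")
LT_GT_ONLY = re.compile(r"&(?=(?:lt|gt);)")


def escape_with(pattern, source):
    """Replace every ampersand matched by `pattern` with '&amp;'."""
    return pattern.sub("&amp;", source)


# An entry whose author already escaped the ampersand in the URL.
link = '<a href="http://example.com/?a=1&amp;b=2">&lt;example&gt;</a>'

# Escaping all four cases also rewrites the URL's '&amp;' to '&amp;amp;'.
all_four = escape_with(ALL_FOUR, link)

# Restricting the escape to lt and gt leaves the URL untouched.
lt_gt_only = escape_with(LT_GT_ONLY, link)

print(all_four)
print(lt_gt_only)
```

Whether the rewritten URL actually breaks the link depends on how the nohtml view renders its output, which this sketch does not model.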
Post a comment if you see anything else.
An organized list, or a few short paragraphs detailing suggestions for implementation
For escaping ampersands, this should be rather trivial; just add a bit of code to the nohtml-ification that replaces &lt; with &amp;lt;, etc.
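As a rough sketch of what that "bit of code" might look like, here is one way to do the selective escape in Python. The function name `escape_entity_ampersands` is made up, the real nohtml code is not shown here, and requiring a trailing semicolon after the entity name is my assumption.

```python
import re

# Hypothetical helper, not LiveJournal's actual code. The lookahead matches
# an '&' only when it is immediately followed by lt;, gt;, amp;, or quot;
# (the trailing ';' is an assumption of this sketch).
_ENTITY_AMPERSAND = re.compile(r"&(?=(?:lt|gt|amp|quot);)")


def escape_entity_ampersands(source: str) -> str:
    """Escape '&' in the four listed entity references and nowhere else."""
    return _ENTITY_AMPERSAND.sub("&amp;", source)


if __name__ == "__main__":
    print(escape_entity_ampersands("&lt;b&gt; and page.cgi?a=1&b=2"))
```

A bare ampersand in a URL and other entity references such as `&nbsp;` pass through unchanged, which matches the "only these four cases" restriction.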
For ignoring "Don't Auto-Format," I imagine this would take the form of changing something like "if(dontautoformat ne 1)" to "if(dontautoformat ne 1 || nohtml == 1)"; I don't really know, though.
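The intended logic of that condition can be written out as a tiny Python sketch. The function and parameter names are hypothetical and do not come from LiveJournal's code, and the sketch does not cover <lj-raw> handling.

```python
def should_auto_format(dont_auto_format: bool, nohtml: bool) -> bool:
    """Decide whether to auto-format an entry.

    Formatting is applied when the entry has not opted out, or whenever
    the entry is being viewed with nohtml=1 (which overrides the opt-out).
    """
    return (not dont_auto_format) or nohtml


if __name__ == "__main__":
    # Entry opted out of auto-formatting, but is viewed with nohtml=1.
    print(should_auto_format(dont_auto_format=True, nohtml=True))
```

Only one of the four combinations skips formatting: the option is on and the entry is viewed normally.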