<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tech Blog &#187; User input</title>
	<atom:link href="http://assanka.net/content/tech/tag/user-input/feed/" rel="self" type="application/rss+xml" />
	<link>http://assanka.net/content/tech</link>
	<description>Just another Arb-assk2009003.turmeric.assanka.com Blogs weblog</description>
	<lastBuildDate>Mon, 29 Aug 2011 19:00:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Validating HTML input in PHP</title>
		<link>http://assanka.net/content/tech/2009/10/10/validating-html-input-in-php/</link>
		<comments>http://assanka.net/content/tech/2009/10/10/validating-html-input-in-php/#comments</comments>
		<pubDate>Sat, 10 Oct 2009 11:51:41 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[DTDs]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[User input]]></category>
		<category><![CDATA[Validation]]></category>
		<category><![CDATA[XHTML]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=38</guid>
		<description><![CDATA[It&#8217;s often the case that as web developers, we need to &#8216;clean&#8217; input from end users to ensure it does not contain any nasty formatting or script that we don&#8217;t want to allow on our sites.  Forums in particular often suffer from either security holes that allow cross site scripting attacks (XSS) or are [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s often the case that as web developers, we need to &#8216;clean&#8217; input from end users to ensure it does not contain any nasty formatting or script that we don&#8217;t want to allow on our sites.  Forums in particular often suffer from either security holes that allow cross site scripting attacks (XSS) or are so restrictive in what they allow to be input that it causes a nuisance to the user (for example, disallowing all HTML but allowing BBCode instead).</p>
<p>This problem is often solved with complex classes or functions in PHP that are designed to strip out the nasty stuff while allowing as much useful formatting as possible.  We realised that these functions are pretty much just reinventing the wheel, because there is already a pretty good mechanism for parsing and validating XML syntax: libxml, which has PHP bindings and can be accessed using SimpleXML.</p>
<p>What&#8217;s more, libxml can validate an XML document against a DTD, so if you reference a DTD such as XHTML Transitional in a DOCTYPE at the start of your XML string, you can check that the markup is valid XHTML.</p>
<p>Here&#8217;s the PHP to do this.  This is tested on PHP 5.3 with libxml2-2.6.26-2.1.2.8.</p>
<pre class="brush: php;">
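function isXML($str) {
	// Collect parse errors internally instead of emitting PHP warnings
	libxml_use_internal_errors(true);
	libxml_clear_errors();
	// Only load and validate against a DTD if the string declares one
	$options = (strpos($str, '&lt;!DOCTYPE') !== false) ? (LIBXML_DTDLOAD | LIBXML_DTDVALID) : 0;
	$xml = simplexml_load_string($str, 'SimpleXMLElement', $options);
	if ($xml === false) return false;
	// Warnings are tolerated; any error or fatal error fails validation
	foreach (libxml_get_errors() as $error) {
		if ($error-&gt;level != LIBXML_ERR_WARNING) return false;
	}
	return true;
}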
</pre>
<p>You could of course use the contents of <code>$errors</code> to feed back to the user, or potentially deal with a validation failure more intelligently, but for now true or false will do.</p>
<p>So the markup submitted by a user is valid.  Excellent.  But just because the markup is valid doesn&#8217;t mean it&#8217;s safe to output to the browser.  You&#8217;ll also want to ensure it contains no <code>&lt;script type="text/javascript"&gt;</code> sections or event handlers, and may want to restrict the set of elements available.  This is where you can start getting creative with your own DTD spec.  Just start with the standard you want to conform to for the whole page (say XHTML) and strip out anything you don&#8217;t like.</p>
<p>We&#8217;ll start by removing the HEAD tag and all its contents.  Our users will not be writing entire documents, just fragments of body markup, so we don&#8217;t want a HEAD, TITLE, or any META tags, etc.</p>
<p>You can continue, removing things like SCRIPT, OBJECT, forms, frames, and so on.  Be careful where elements are defined using presets, which often contain the nasties, for example the <code>%event</code> set of attributes grants an element the ability to fire event handlers.  Fortunately this is almost exclusively used as part of <code>%attrs</code>, so we can just remove it from that superset.</p>
<p>We&#8217;ll also define a new root element <code>fragment_under_test</code> to ensure that we don&#8217;t cause any confusion and lead anyone to believe that they&#8217;re writing a normal <code>&lt;html&gt;</code> or <code>&lt;body&gt;</code>.</p>
<p>Once we&#8217;re done, we can then wrap the isXML function in a convenience function that adds our new custom DTD.</p>
<pre class="brush: php;">
function isXHTMLFragment($str) {
	return isXML(&quot;&lt;!DOCTYPE fragment_under_test SYSTEM \&quot;http://www.example.com/dtds/xhtml-content-restrictive.dtd\&quot;&gt;&lt;fragment_under_test&gt;&quot;.$str.&quot;&lt;/fragment_under_test&gt;&quot;);
}
</pre>
<p>If you want, feel free to <a href="/content/tech/files/2009/12/dtdfiles.zip">download the DTD I created for this article</a>.</p>
<p>Now you can use the fast libxml to validate user input in a fairly bulletproof way.</p>
<p>Finally, and very importantly, make sure you cache the schemas on your server in an XML catalog file.  If you don&#8217;t do this, libxml will make an external HTTP request for the DTD schema file every time you call the function.  In fact, since most web documents cite W3C DTDs, they are having enormous problems with software making repeated requests for the standard XHTML, HTML 4 etc DTDs which haven&#8217;t changed in years.  Be a good net citizen, and cache your schemas.  In this case we&#8217;re writing and hosting our own anyway, but if you&#8217;re using a public schema you may as well save yourself the pointless HTTP traffic, and it&#8217;ll speed up the validation as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2009/10/10/validating-html-input-in-php/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Auto growing textareas</title>
		<link>http://assanka.net/content/tech/2009/05/04/auto-growing-textareas/</link>
		<comments>http://assanka.net/content/tech/2009/05/04/auto-growing-textareas/#comments</comments>
		<pubDate>Mon, 04 May 2009 16:24:52 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[UI]]></category>
		<category><![CDATA[User experience]]></category>
		<category><![CDATA[User input]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=3</guid>
		<description><![CDATA[This feels like a topic that&#8217;s been explored to death already, but I really don&#8217;t like most implementations of this technique, so here&#8217;s how we do it.
First, in case anyone has just arrived from Mars, or even more unlikely, isn&#8217;t familiar with Facebook, the auto-growing textarea is a text box that gets bigger as you [...]]]></description>
			<content:encoded><![CDATA[<p>This feels like a topic that&#8217;s been explored to death already, but I really don&#8217;t like most implementations of this technique, so here&#8217;s how we do it.</p>
<p>First, in case anyone has just arrived from Mars, or even more unlikely, isn&#8217;t familiar with Facebook, the auto-growing textarea is a text box that gets bigger as you type into it, so that you never see a scroll bar (unless you&#8217;re typing War and Peace).</p>
<p>There are lots of ways of doing each part of this.  The parts I&#8217;ll look at are (a) what triggers a review of the text box&#8217;s size, (b) how to determine whether a resize is required, and (c) how to perform a resize.</p>
<h2>Triggering a review</h2>
<p>There are four ways this is typically done:</p>
<ol>
<li> On an interval, say every 50ms</li>
<li> On keypress</li>
<li> On change</li>
<li> On keyup</li>
</ol>
<p>I&#8217;m staggered by how many scripts insist on doing this kind of thing on an interval.  Setting up interval timers for no good reason is a shortcut to terrible performance and memory leaks galore.  So we won&#8217;t be doing that.  The keypress event fires before the box is updated with the latest character, which might have been the one to bring us onto a new line, and the change event only fires once the textarea loses focus, so we want to avoid both.  That leaves onkeyup, which is fired after the box is updated with the new character, and allows us to inspect it and decide whether to increase its size.</p>
<h2>To resize or not to resize</h2>
<p>How to determine whether to resize the box comes in three flavours:</p>
<ol>
<li> Count the number of newline characters in the textarea&#8217;s value, and see if that matches the number of rows the textarea has.</li>
<li> Create a &#8216;shadow&#8217; DIV which is off screen (e.g. margin-left: -10000px) and has no height declared, but otherwise has the same style properties as the textarea, fill it with the textarea&#8217;s content, measure the resulting height of the DIV, then see if that height matches the current height of the textarea.</li>
<li> Check whether scrollHeight (the height of the scrollable content of the box) &gt; clientHeight (the height of the box itself).</li>
</ol>
<p>In this case I was surprised at the number of implementations that favoured counting newlines.  This only works if combined with counting the number of characters in the textarea&#8217;s value, AND a monospaced font, AND knowing the number of columns in the textarea.  In short, er, it&#8217;s mad.</p>
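<p>To see why, here is a rough sketch of what a row estimate would have to do.  It assumes a monospaced font and a known column count, and <code>estimateRows</code> is a hypothetical helper written for this illustration, not part of any library.  Real browsers wrap at word boundaries, so even this is only an approximation.</p>
<pre class="brush: jscript;">
// Hypothetical helper: estimate the rows needed to show `text` in a textarea
// that is `cols` characters wide. Assumes a monospaced font and hard wrapping
// at exactly `cols` characters, which real browsers do not guarantee.
function estimateRows(text, cols) {
	var lines = text.split('\n');
	var rows = 0;
	for (var i = 0; i &lt; lines.length; i++) {
		// An empty line still occupies one row
		rows += Math.max(1, Math.ceil(lines[i].length / cols));
	}
	return rows;
}

// Counting newlines alone would report 2 rows for this value
estimateRows('aaaaaaaaaaaaaaaaaaaaaaaaa\nbb', 10); // returns 4
</pre>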
<p>Second option, and the one favoured by a lot of the framework plugins due to the ease with which you can create shadow elements in the likes of jQuery, is to create a shadow DIV. This has the advantage of telling you the actual pixel height of the text, even where it is LESS than the height of the textarea box.  Otherwise, you&#8217;re limited to measuring clientHeight and scrollHeight, which are exactly the same if the textarea isn&#8217;t scrolling, regardless of how much space you&#8217;ve got to spare at the bottom.  My issue with this method is that it basically requires use of a framework to not be painful, and even then it&#8217;s non-trivial amounts of code, and adds needless pollution to the DOM.</p>
<p>So that leaves relying on scrollHeight and clientHeight.  These are well supported and efficient to read, so provided that you work around the issue of scrollHeight always being at least equal to clientHeight, this offers a very lightweight solution.  The features that you can achieve with a shadow DIV that you can&#8217;t do by simply reading scrollHeight and clientHeight are (a) you can ensure there is always a blank line at the bottom of your textarea, and (b) you can shrink the textarea if the user deletes text from it.  I&#8217;m personally of the view that neither of these is actually particularly desirable.  There&#8217;s potentially an argument for leaving a blank line at the bottom, but equally the user might feel like they&#8217;re just being pressured to write more.</p>
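<p>As a minimal sketch of that limitation, the check below reports how far the content overflows the box.  <code>overflowPixels</code> is a hypothetical name, and the argument can be a real textarea element or any object exposing the same two properties.  Note that it returns 0 both for an exact fit and for a box with room to spare, which is why shrinking isn&#8217;t possible with this method alone.</p>
<pre class="brush: jscript;">
// Hypothetical helper: how many pixels of content do not fit in the box.
// scrollHeight is never smaller than clientHeight in a real browser, so
// this can detect overflow but never spare room at the bottom.
function overflowPixels(box) {
	return Math.max(0, box.scrollHeight - box.clientHeight);
}

overflowPixels({ scrollHeight: 120, clientHeight: 90 }); // returns 30
overflowPixels({ scrollHeight: 90, clientHeight: 90 });  // returns 0
</pre>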
<h2>How to resize the box</h2>
<p>OK, so if you&#8217;ve concluded that the user is writing chapter and verse and the textarea is in need of a bit more space, how do we go about doing it?</p>
<ol>
<li> Animate the CSS height property</li>
<li> Add some height via the CSS height property</li>
<li> Add 1 to the rows attribute</li>
</ol>
<p>A fair few of the framework plugins use their framework&#8217;s animation capabilities to animate the grow effect.  I don&#8217;t like this.  Just because you have an animation effect available doesn&#8217;t mean it&#8217;s appropriate to use it, and there are many situations where the end user just doesn&#8217;t want to wait 300ms for the privilege of using a fresh line.</p>
<p>Simply adding height to the CSS property is a fair way of doing it, but unless you do some maths on the user&#8217;s line height (or hard code some magic numbers), you can&#8217;t necessarily guarantee that the resulting height will be a multiple of the textarea&#8217;s line height.</p>
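<p>For what it&#8217;s worth, the maths itself is short.  A minimal sketch, assuming you already know the line height in pixels (for example from the computed style); <code>snapToLineHeight</code> is a hypothetical helper, not something from a framework:</p>
<pre class="brush: jscript;">
// Hypothetical helper: round a pixel height up to a whole number of lines.
// lineHeight is assumed to be a known, positive pixel value.
function snapToLineHeight(height, lineHeight) {
	return Math.ceil(height / lineHeight) * lineHeight;
}

snapToLineHeight(100, 18); // returns 108, i.e. six full lines
</pre>
<p>The hard part is not the rounding but getting a trustworthy line height in the first place, which is where the magic numbers tend to creep in.</p>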
<p>Easiest, most efficient solution: add one to the rows attribute of the textarea.  rows is part of XHTML as well as HTML 4.01, and has universal support going back yonks.</p>
<h2>Pasting</h2>
<p>Watch out for pastes.  If the user pastes in a large quantity of text, they will trigger only one keyup event, but will have added many lines to the textarea.  Make sure that if a resize is required, you trigger another review after the resize is complete.</p>
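<p>One way to do that is to grow in a loop rather than once per event.  The sketch below is an illustration under assumptions: <code>growToFit</code> is a hypothetical helper, and the stand-in object mimics a textarea whose clientHeight follows its rows, so that the loop can be tried outside a browser.</p>
<pre class="brush: jscript;">
// Hypothetical helper: add rows until the content fits or maxRows is reached.
// `box` is anything exposing rows, scrollHeight and clientHeight, so a real
// textarea element works, as does a stand-in object.
function growToFit(box, maxRows) {
	var added = 0;
	while (box.scrollHeight &gt; box.clientHeight &amp;&amp; box.rows &lt; maxRows) {
		box.rows += 1; // a browser recalculates clientHeight after this
		added += 1;
	}
	return added;
}

// Stand-in for a textarea after a large paste: 2 rows of 10px, 75px of content
var pasted = {
	rows: 2,
	scrollHeight: 75,
	get clientHeight() { return this.rows * 10; }
};
growToFit(pasted, 30); // returns 6, leaving pasted.rows at 8
</pre>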
<h2>Max height: the War and Peace scenario</h2>
<p>If the user really does seem to be writing a novel, we probably should call it a day on growing the textarea at some stage, and certainly <strong>before it gets to be taller than the viewport</strong>.  You can check the height against the viewport height, though I typically just restrict it to an arbitrary height, say 30 rows.</p>
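<p>If you did want to take the viewport into account, a rough sketch might look like this.  <code>maxRowsFor</code> is a hypothetical helper, and it assumes you pass in the viewport height (for example <code>window.innerHeight</code> where available) and a line height in pixels.  It ignores padding, borders and where the textarea sits on the page, so treat the result as an upper bound.</p>
<pre class="brush: jscript;">
// Hypothetical helper: the most rows a textarea may have without exceeding
// either a hard cap or the height of the viewport. Always allows one row.
function maxRowsFor(viewportHeight, lineHeight, hardCap) {
	var rowsThatFit = Math.floor(viewportHeight / lineHeight);
	return Math.max(1, Math.min(hardCap, rowsThatFit));
}

maxRowsFor(400, 18, 30);  // returns 22, the viewport is the limit
maxRowsFor(1200, 18, 30); // returns 30, the hard cap is the limit
</pre>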
<h2>The code</h2>
<p>You&#8217;ll need a textarea element with an ID of <code>mytextarea</code> to make this code sample work, and obviously you can easily modify it to use selectors from your favourite framework rather than the native <code>document.getElementById</code>.</p>
<pre class="brush: jscript;">
document.getElementById('mytextarea').onkeyup = function() {
	var ta = document.getElementById('mytextarea');
	var maxrows = 30;
	// Keep adding rows until the content fits (this also covers large pastes) or the cap is reached
	while (ta.scrollHeight &gt; ta.clientHeight &amp;&amp; !window.opera &amp;&amp; ta.rows &lt; maxrows) {
		ta.style.overflow = 'hidden';
		ta.rows += 1;
	}
	// Still overflowing at the cap: fall back to a normal scroll bar
	if (ta.scrollHeight &gt; ta.clientHeight) ta.style.overflow = 'auto';
};
</pre>
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2009/05/04/auto-growing-textareas/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

