<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tech Blog</title>
	<atom:link href="http://assanka.net/content/tech/feed/" rel="self" type="application/rss+xml" />
	<link>http://assanka.net/content/tech</link>
	<description>Just another Arb-assk2009003.turmeric.assanka.com Blogs weblog</description>
	<lastBuildDate>Mon, 29 Aug 2011 19:00:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>FastClick: native-like tapping for touch apps</title>
		<link>http://assanka.net/content/tech/2011/08/26/fastclick-native-like-tapping-for-touch-apps/</link>
		<comments>http://assanka.net/content/tech/2011/08/26/fastclick-native-like-tapping-for-touch-apps/#comments</comments>
		<pubDate>Fri, 26 Aug 2011 08:44:46 +0000</pubDate>
		<dc:creator>Matt Caruana Galizia</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[events]]></category>
		<category><![CDATA[fastclick]]></category>
		<category><![CDATA[html5]]></category>
		<category><![CDATA[iOS]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[touch]]></category>
		<category><![CDATA[UI]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=227</guid>
		<description><![CDATA[The JavaScript APIs for handling touch events and gestures are simple and intuitive enough to grasp on first try, and the increasingly excellent support for web standards in WebKit browsers means we can create highly touch-optimised web apps that look and feel like native apps.   But that doesn’t mean there [...]]]></description>
			<content:encoded><![CDATA[<p>The JavaScript APIs for handling touch events and gestures are simple and intuitive enough to grasp on first try, and the increasingly excellent support for web standards in WebKit browsers means we can create highly touch-optimised web apps that look and feel like native apps.   But that doesn’t mean there won’t be any hitches when you get deeper into web app development, and one of those hitches only becomes apparent when users stop thinking of your app as a website.</p>
<p>Probably because touchscreen devices weren’t nearly as popular in 1995 as they are today, JavaScript DOM events tend to reflect the mouse ‘click’ paradigm, where each event neatly corresponds to a single, deliberate click of the button. Take the onClick event, for example. How does that translate to a touchscreen device? Not easily, as it turns out.</p>
<p>Treating a ‘tap’ as a ‘click’ is the practical approach. On iOS at least, there’s no onTap event &#8211; <code>onClick</code> has been repurposed for that role. But in order to properly handle multi-touch gestures like pinching or even double-taps, some compromises have to be made. One of these is the roughly 300ms delay between a tap and the firing of a click event, which can make your apps feel laggy and unresponsive even when they’re technically not.</p>
<p>The technique we use to get around this problem is to track all <code>touchstart</code> events in our app and fire a click event as soon as we receive a <code>touchend</code> event (unless some application-specific exception is satisfied).</p>
<p>As we refined our fast-clicking code, we turned it into a small and efficient library, which we call <strong>FastClick</strong>. This code has been tried and tested by hundreds of thousands of users, and so far has proved to be very robust.  We’d love to know what others are doing to address this challenge and find ways of improving our own approach, and to help kick off that process, we’re open-sourcing FastClick today.  We’d encourage you to try it out, and let us know what you think.</p>
<p>To use FastClick, instantiate it on the layer you’d like to be fast-clickable &#8211; we use <code>document.body</code> because we want all our buttons and links to receive fast clicks. In your event listeners, ‘click’ events synthesised by FastClick will have the <code>forwardedTouchEvent</code> property set to true.</p>
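<p>As a rough sketch of how a listener might tell the two kinds of click apart: the <code>forwardedTouchEvent</code> property is the one FastClick sets, as described above, while the <code>clickSource</code> helper name and the commented-out listener are purely illustrative.</p>
<pre class="brush: jscript;">
// Illustrative helper (not part of FastClick): report where a click came from.
function clickSource(event) {
  // FastClick sets forwardedTouchEvent to true on the clicks it synthesises
  return event.forwardedTouchEvent === true ? 'fastclick' : 'native';
}

// Hypothetical usage in a browser:
// document.body.addEventListener('click', function (event) {
//   console.log(clickSource(event));
// }, false);
</pre>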
<p>If you use buttons and iOS-style menus in your app then there’s a good chance your interface feels unresponsive on touchscreen devices. Here’s a simple example of how we solve that problem for a single button by instantiating FastClick on it:</p>
<pre class="brush: xml;">
&lt;button class=&quot;fastclick&quot; onclick=&quot;someHandler()&quot;&gt;Fast click&lt;/button&gt;
&lt;button onclick=&quot;someHandler()&quot;&gt;Slow click&lt;/button&gt;
&lt;script type='text/javascript'&gt;
var button = document.querySelector(&quot;.fastclick&quot;);
new FastClick(button);
&lt;/script&gt;
</pre>
<p>In the example, the button with a <code>fastclick</code> class will have its click handler called as soon as your finger is lifted off the screen, while tapping on any other buttons on the page will trigger the same handler after a noticeable delay. Try the <a href="http://assanka.net/content/tech/files/2011/08/fastclickdemo.html">live demo</a>.</p>
<p>Unfortunately, the <code>select</code> element doesn&#8217;t behave normally when receiving (synthesised) programmatic clicks, so if you apply FastClick to an element that contains selects, FastClick will ignore clicks on the select and allow the normal click event to fire.</p>
<p>If you want any other elements (besides selects) in your FastClick layer to receive non-programmatic clicks, you&#8217;ll need to use one of two classes: <code>clickevent</code> or <code>touchandclickevent</code>. For any clickable element in a FastClick layer, tapping the element will cause different effects depending on how you use these classes:</p>
<ul>
<li>No class: The element will receive only a programmatic click from FastClick.  The default click event triggered by the user will be suppressed.</li>
<li><code>clickevent</code>: The element will receive only the default click event, and will be ignored by FastClick.</li>
<li><code>touchandclickevent</code>: The element will receive both the default click event AND a programmatic click from FastClick (the FastClick one will be triggered first). This is only safe if your handler&#8217;s action is idempotent.</li>
</ul>
<p>Here&#8217;s an example of all three:</p>
<pre class="brush: xml;">
&lt;div class=&quot;fastclick&quot;&gt;
	&lt;button onclick=&quot;someHandler()&quot;&gt;
		Will receive programmatic click event
	&lt;/button&gt;
	&lt;button class=&quot;clickevent&quot; onclick=&quot;someHandler()&quot;&gt;
		Will receive non-programmatic click event
	&lt;/button&gt;
	&lt;button class=&quot;touchandclickevent&quot; onclick=&quot;someHandler()&quot;&gt;
		Will receive both click events
	&lt;/button&gt;
&lt;/div&gt;
&lt;script type='text/javascript'&gt;
var button = document.querySelector(&quot;.fastclick&quot;);
new FastClick(button);
&lt;/script&gt;
</pre>
<p>When is this useful? Try the <a href="http://assanka.net/content/tech/files/2011/08/fastclickdemo-input.html">other live demo</a> we&#8217;ve built using a click event handler that attempts to trigger focus on an <code>input</code> element. iOS will only allow focus to be triggered on other elements within a handler function if the event that triggered it was non-programmatic.</p>
<p>We’ll be posting updates and answering questions on this blog. If the interest reaches a stage where FastClick could benefit from the developer community, we’ll move to a public repository host. But for now, concentrate on giving your users the best response time they can get &#8211; <a href="http://assanka.net/content/tech/files/2011/08/fastclick.js">download FastClick</a> (or <a href="http://assanka.net/content/tech/files/2011/08/fastclick.min_.js">minified</a>) and give it a go. It’s free to use in all your apps, and licensed under the <a href="http://www.opensource.org/licenses/mit-license.php">MIT license</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2011/08/26/fastclick-native-like-tapping-for-touch-apps/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Tips for better fragment navigation</title>
		<link>http://assanka.net/content/tech/2011/02/16/tips-for-better-fragment-navigation/</link>
		<comments>http://assanka.net/content/tech/2011/02/16/tips-for-better-fragment-navigation/#comments</comments>
		<pubDate>Wed, 16 Feb 2011 10:05:07 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[ajax]]></category>
		<category><![CDATA[fragments]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[jQuery]]></category>
		<category><![CDATA[keyboard]]></category>
		<category><![CDATA[Scrolling]]></category>
		<category><![CDATA[UI]]></category>
		<category><![CDATA[User experience]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=170</guid>
		<description><![CDATA[Fragment navigation is becoming more and more popular.  Facebook practically runs their entire site on it, Twitter search uses it, and Google recently defined a standard for making it crawl-able.  JavaScript framework evangelists have rushed to produce plug-ins for their favourite tool-kit to make fragment navigation easier to implement.  Here I&#8217;ll discuss [...]]]></description>
			<content:encoded><![CDATA[<p>Fragment navigation is becoming more and more popular.  Facebook practically runs their entire site on it, Twitter search uses it, and Google recently defined a <a href="http://code.google.com/web/ajaxcrawling/docs/getting-started.html">standard for making it crawl-able</a>.  JavaScript framework evangelists have rushed to produce plug-ins for their favourite tool-kit to make fragment navigation easier to implement.  Here I&#8217;ll discuss some of the issues we&#8217;ve dealt with as we started to use fragment navigation more often.</p>
<p><img src="http://assanka.net/content/tech/files/2010/08/new-1.png" alt="" title="Facebook example of fragment navigation" width="590" height="72" class="alignnone size-full wp-image-178" /></p>
<p>The details of how fragment navigation (also hash navigation or hash fragment navigation) works have been <a href="http://www.novatek.com.au/news/fragment-navigation">covered</a> <a href="http://yensdesign.com/2008/11/creating-ajax-websites-based-on-anchor-navigation/">extensively</a> by <a href="http://msdn.microsoft.com/en-us/library/cc891506(VS.85).aspx">others</a>, and have been abstracted into various frameworks and toolkits.  A number are available for jQuery, such as:</p>
<ul>
<li><a href="http://plugins.jquery.com/project/history">History</a></li>
<li><a>BBQ</a> (<a href="http://mattfrear.com/2010/03/20/enabling-browser-back-button-on-cascading-dropdowns-with-jquery-bbq-plugin/">tutorial</a>)</li>
<li><a href="http://www.asual.com/jquery/address/">Address</a></li>
</ul>
<p>However, fragment navigation comes with a new set of challenges that we found ourselves having to address, so these will be the focus of this post.</p>
<h2>Full page alternatives</h2>
<p>Rather than changing all links to the form <code>&lt;a href="#!fragment/path/for/ajax"&gt;Link&lt;/a&gt;</code> and then detecting and acting upon fragment changes in JavaScript, we wanted to ensure that our links worked when they were copied and pasted into a new browser window, shared by email, or used with JavaScript disabled.  This meant making them into real URL links, and then progressively enhancing them with a JavaScript onclick handler that cancels the normal navigation action:</p>
<pre class="brush: xml;">
&lt;a href=&quot;/path/works/as/fragment/or/normal/page&quot; class=&quot;fragnav&quot;&gt;Link&lt;/a&gt;
</pre>
<p>We then use some jQuery to hook the onclick handler on these fragment-navigation-compatible links, so that rather than navigating to the specified href, the browser changes its hash instead:</p>
<pre class="brush: jscript;">
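$(&quot;a.fragnav&quot;).click(function() {
  // Strip the scheme, host and any existing #! prefix, leaving just the path
  var href = this.href.replace(/^https?:\/\/[a-z0-9\.]+\/(.*\#\!)?/, '');
  // Set the hash to the extracted path rather than to the full URL
  location.hash = href;
  return false;
});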
</pre>
<p>Now you can either click the link with JavaScript enabled and get dynamic AJAX behaviour, or right click then &#8216;Open in new window&#8217; (or click with JavaScript disabled) to get the full URL of the link to load normally.</p>
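<p>The other half of the job is turning a changed fragment back into something you can request.  A minimal sketch, assuming fragments of the form <code>#!path</code> or <code>#path</code>; the <code>fragmentToPath</code> helper and the <code>loadContent</code> function in the commented usage are hypothetical names, not part of any plug-in mentioned above:</p>
<pre class="brush: jscript;">
// Illustrative: turn a location.hash value such as '#!news/today' into a path
function fragmentToPath(hash) {
  // Drop the leading '#', an optional '!' and an optional '/', then re-root the path
  return '/' + String(hash).replace(/^#!?\/?/, '');
}

// Hypothetical usage in a browser that supports the hashchange event:
// window.addEventListener('hashchange', function () {
//   loadContent(fragmentToPath(location.hash));
// }, false);
</pre>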
<h2>Caching</h2>
<p>When your user presses &#8216;back&#8217;, you can detect the fragment change and reload appropriate content &#8211; great.  But this is not as good as what you get from browsers&#8217; normal back button behaviour.  Normally, the previous page is cached, so it reappears almost instantly.  In the AJAX version, unless your fragment is just switching between different versions of content that are all preloaded, you may fire an AJAX request to load the content again.</p>
<p>It shouldn&#8217;t be necessary to reimplement caching yourself, since AJAX requests are subject to the same browser caching as normal page loads, but it&#8217;s easy to forget that you should include appropriate cache headers with your AJAX responses.  There are four HTTP headers that generally modify a browser&#8217;s caching behaviour:</p>
<ul>
<li>Cache-Control (<a href='http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9'>Docs</a>)</li>
<li>Expires (<a href='http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.21'>Docs</a>)</li>
<li>ETag (<a href='http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.19'>Docs</a>)</li>
<li>Last-Modified (<a href='http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.29'>Docs</a>)</li>
</ul>
<p>Cache-Control and Expires are cache directive headers, telling the browser whether and for how long it may cache the resource.  Typically we set only Cache-Control, as it is more powerful and flexible, and Expires is generally unnecessary.</p>
<p>ETag and Last-Modified are validators.  These allow the browser to make a conditional request to the server to find out if the resource has changed, and allow the server to respond with a simple 304 Not Modified if it hasn&#8217;t.  It&#8217;s often the case that your web server will add these automatically, but we prefer to avoid them entirely, relying on a single Cache-Control header to determine browser behaviour.</p>
<pre class="brush: xml;">
Cache-Control: max-age=60, public
</pre>
<p>A nice trick for content that is personalised to an authenticated user is to include the <code>private</code> directive in your Cache-Control header, which allows the page to be cached, but only by the browser, not by any proxies along the way.</p>
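<p>For example, a personalised response that only the browser may cache for a minute could be sent with a header like this (the 60-second lifetime is an arbitrary value chosen for illustration):</p>
<pre class="brush: xml;">
Cache-Control: max-age=60, private
</pre>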
<h2>Scroll position</h2>
<p>One of the things we found to be most difficult was managing scroll position.  Consider the default behaviour of a browser when navigating normally.  If you scroll down a page and click a link somewhere near the bottom, the browser loads the new page and displays it starting at the top.  However, if you then press the back button, it redisplays the last page you viewed and <strong>restores your scroll position for that page</strong>.  You may not have even noticed this, but if it stopped behaving this way you&#8217;d get pretty annoyed soon enough!</p>
<p>So, if a user clicks one of your fragment links low down on your page, and the change you make to the page as a result would appear to the user to constitute a new page (obviously the point at which perception of &#8216;a new page&#8217; is triggered is subjective), consider scrolling the browser window back to the top.</p>
<p>But, you should not do this if the back button is used, because the browser will restore the user&#8217;s previous scroll position all by itself.  However, this becomes more complex if navigating forwards has considerably shortened the page content, and restoring the scroll position on back would require you to restore the content first in order for the necessary scroll offset to actually exist.  This is a bit confusing.  Here&#8217;s an illustration:</p>
<p><img src="http://assanka.net/content/tech/files/2011/01/ajaxnavgraphic1.png" alt="Illustration of problems with scroll position when using AJAX navigation" title="Illustration of problems with scroll position when using AJAX navigation" class="aligncenter size-full wp-image-197" /></p>
<p>So the solution we use is to remember the scroll position on every navigation action, and then restore it after repopulating the content if it looks like a backwards step.  You need to ensure you have got your caching rules right to support this, otherwise the delay in refetching the content will make the browser reposition the scroll position twice &#8211; once by itself immediately, and again triggered by your JavaScript after the content has loaded.</p>
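<p>A minimal sketch of that bookkeeping follows.  It is not our production code: the <code>createScrollMemory</code> name is made up for illustration, and &#8216;looks like a backwards step&#8217; is approximated here as &#8216;the destination is the fragment we were on immediately before this one&#8217;, which will also match some forward navigations.</p>
<pre class="brush: jscript;">
// Sketch: remember scroll offsets per fragment and restore them on 'back'.
function createScrollMemory() {
  var positions = {};   // maps each fragment to its last known vertical offset
  var previous = null;  // the fragment we were on before the current one
  var current = null;   // the fragment currently displayed

  return {
    // Call on every navigation, before replacing the content.
    // Returns the offset to scroll to once the new content is in place.
    navigate: function (toFragment, currentOffset) {
      if (current !== null) positions[current] = currentOffset;
      var isBack = toFragment === previous;
      previous = current;
      current = toFragment;
      // A backwards step restores the remembered offset; anything else starts at the top
      return isBack ? (positions[toFragment] || 0) : 0;
    }
  };
}

// Hypothetical usage, once the new content has been inserted:
// window.scrollTo(0, memory.navigate(newFragment, currentScrollTop));
</pre>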
<h2>Loading pause</h2>
<p>Normally, the process of clicking a link and navigating to a new page involves the current page <strong>remaining on screen</strong> while the new page is being requested.  The browser only blanks it out when content starts to arrive for the new page.  The browser does provide some progress feedback immediately though, in the form of a wait mouse cursor or spinner in the browser chrome.</p>
<p>We aimed to replicate this experience for our AJAX-loaded views.  This means avoiding what seems like an obvious solution &#8211; empty the container when the link is clicked, and populate it when the AJAX response is received &#8211; because you&#8217;ll get a flash of blankness between the old content disappearing and new content replacing it.  Instead, the old content should remain, and the new content should simply replace it when the AJAX completes.</p>
<p>But this doesn&#8217;t provide progress feedback.  The risk is, the user will think nothing&#8217;s happened and will click the link again.  This tends to happen after about 2-3 seconds for most users, but AJAX calls normally don&#8217;t take anywhere near that long.  So we implemented a timeout that, after 250ms, would blank out the container, replacing the old content with a loading spinner.  If the AJAX completes within that time, it cancels the timer.</p>
<p>So, in most cases the user clicks a link and sees an almost immediate cut from old content to new with no flash of blankness.  In some cases they see the old content vanish and a loading spinner to reassure them that something is happening while we wait for the new content to come back.</p>
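<p>The timing logic described above can be sketched roughly as follows.  The <code>loadWithDelayedSpinner</code> name, the <code>fetchContent(path, callback)</code> function and the plain-text stand-in for a spinner are all assumptions made for the sake of the example:</p>
<pre class="brush: jscript;">
// Sketch: keep old content on screen, and show a spinner only if loading is slow.
// fetchContent(path, callback) is a hypothetical function that performs the AJAX request.
function loadWithDelayedSpinner(path, container, fetchContent, delayMs) {
  var timer = setTimeout(function () {
    container.innerHTML = 'Loading...';  // stand-in for a real loading spinner
  }, delayMs);

  fetchContent(path, function (html) {
    clearTimeout(timer);         // fast response: the spinner never appears
    container.innerHTML = html;  // cut straight from old content (or spinner) to new
  });
}

// Hypothetical usage with the 250ms threshold discussed above:
// loadWithDelayedSpinner('/some/path', document.getElementById('main'), fetchContent, 250);
</pre>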
<p>If you regularly have edge cases where content takes more than a few seconds to load, you should consider moving on to a different kind of progress feedback, to avoid hitting that &#8220;it hasn&#8217;t worked&#8221; perception boundary.  The aim is to keep pushing that boundary back, keeping the user&#8217;s expectations in line with the performance that they&#8217;re getting from the site.  I like to think of this like a timeline:</p>
<p><img src="http://assanka.net/content/tech/files/2011/01/ajaxnavgraphic2.png" alt="Timeline of user perception when waiting for a navigation action to complete" title="Timeline of user perception when waiting for a navigation action to complete" width="515" height="169" class="aligncenter size-full wp-image-199" /></p>
<p>By providing better progress feedback, you can expand the &#8216;Expectation&#8217; phase and avoid the &#8216;Frustration&#8217; phase.  However, it&#8217;s also equally important that you don&#8217;t display a big piece of progress feedback for the entire operation to then complete in under 250ms &#8211; the feedback will be on screen for such a short period that the user will potentially be unsure about what happened and get anxious.  So be aware of the &#8216;instant&#8217; phase, when you should present no feedback at all.</p>
<h2>Inline JavaScript</h2>
<p>Sometimes, it&#8217;s convenient to get both new HTML code and new JavaScript when your user clicks a fragment link.  In our case, we wanted to set some JavaScript globals in the response to allow JavaScript code already loaded on the page to better understand the content that had just been loaded.  Say the user has requested a page with a fragment that loads some search results.  We also want to tell the browser the search query that produced the results so that it can, for example, cache them or populate the query into a search field that&#8217;s outside the fragment container.</p>
<p>Normally, if you load content from a server using AJAX and append it to the DOM using innerHTML, any &lt;script&gt; sections in the HTML are not executed by the browser&#8217;s JavaScript engine.  Making it do so is as simple as finding the &lt;script&gt; sections in the returned source and evaling them.  Make sure you properly filter and escape end user input before allowing it to be evaled though, otherwise you&#8217;re opening up your site to XSS attacks.</p>
<pre class="brush: jscript;">
$('#main script').each(function() { eval(this.text); });
</pre>
<h2>Command, Control and Shift modifier keys</h2>
<p>Power users like to open links in new tabs or windows.  In fact, when faced with a list of links, I often hold down Control and hit each link in turn to start preloading all the results into tabs, which I can then review one by one without having to wait for each to load.  To avoid annoying users, you need to ensure that wherever possible, control-click (or command-click) and shift-click (normally &#8216;new tab&#8217; and &#8216;new window&#8217; respectively) still work.</p>
<p>It&#8217;s easy to get lulled into a false sense of security by right clicking on your links and using the context menu to select &#8216;Open in new tab&#8217;.  This won&#8217;t run your onclick handler, so most likely will work if your link has a non-JavaScript href.  But using the keyboard CTRL key while clicking the link will fire the onclick event, and if you are cancelling the normal link navigation action by returning false, you&#8217;ll also be cancelling the new tab or new window request.  Ensuring that you don&#8217;t is quite straightforward (remember to include e.metaKey to detect the Command key on Macs):</p>
<pre class="brush: jscript;">
if (e.shiftKey || e.ctrlKey || e.metaKey) return true;
</pre>
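<p>To show where that check sits, here is a rough sketch that wraps it in a helper and calls it from the <code>a.fragnav</code> click handler used earlier.  The <code>shouldInterceptClick</code> name is made up for this example:</p>
<pre class="brush: jscript;">
// Sketch: decide whether a click should be handled as fragment navigation.
function shouldInterceptClick(e) {
  // Modifier keys mean 'new tab' or 'new window', so leave those clicks to the browser
  return !(e.shiftKey || e.ctrlKey || e.metaKey);
}

// Hypothetical usage with the a.fragnav links from earlier:
// $('a.fragnav').click(function (e) {
//   if (!shouldInterceptClick(e)) return true;  // let the browser open the tab or window
//   // ... change location.hash here ...
//   return false;
// });
</pre>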
<h2>Focus rectangles</h2>
<p>In most browsers, when you click a link, you get a small dotted rectangle around it.  This focus rectangle also appears if you tab through the actionable items on a page, to indicate which one would be followed if you pressed enter.  The problem is that if your link fires an AJAX action, and the link itself remains on screen after the action has completed, you&#8217;re left with a focus rectangle that you probably don&#8217;t want.</p>
<p>A bad way of dealing with this is to simply use CSS to set <code>outline:none</code> on all A tags.  This will certainly solve your problem, but kill all hope of keyboard navigation of your site.  Better is to study the way that this interaction happens normally, and then mimic it for your Ajax actions.  This is actually fairly easy to do generically:</p>
<pre class="brush: jscript;">
(function() {

  // Keep track of which link element last had the focus (if browser does not support document.activeElement)
  if (!document.activeElement) {
    $(&quot;a&quot;).live('focus', function() {
      document.activeElement = this;
    });
  }

  // When any AJAX operation completes, blur any focused link
  $.ajaxSetup({complete:function(xhr, textStatus) {
    if (document.activeElement &amp;&amp; $(document.activeElement).is('a')) document.activeElement.blur();
  }});
}());
</pre>
<p>Drop the above snippet into your JavaScript and you should find that focus rectangles still appear and stay visible while the AJAX request is running, but then vanish once it completes &#8211; a perfect replica of the effect the user has learned to expect.  If your links don&#8217;t perform any AJAX, you&#8217;ll have to deal with those separately.</p>
<h2>Conclusion</h2>
<p>With all this to think about, you&#8217;d be forgiven for concluding that AJAX navigation simply isn&#8217;t worth the bother.  But bear with it &#8211; the benefits of being able to trivially maintain state and load small fragments of content really make a difference, both to the quality of the user experience and the load on your servers.</p>
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2011/02/16/tips-for-better-fragment-navigation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Faceted search: choosing good facet suggestions</title>
		<link>http://assanka.net/content/tech/2010/11/16/faceted-search-choosing-good-facet-suggestions/</link>
		<comments>http://assanka.net/content/tech/2010/11/16/faceted-search-choosing-good-facet-suggestions/#comments</comments>
		<pubDate>Tue, 16 Nov 2010 10:05:50 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[faceting]]></category>
		<category><![CDATA[filters]]></category>
		<category><![CDATA[keywords]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[strategy]]></category>
		<category><![CDATA[suggestions]]></category>
		<category><![CDATA[User experience]]></category>
		<category><![CDATA[xapian]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=175</guid>
		<description><![CDATA[Faceted search is everywhere, making our online shopping experience easier, organising our photos, searching our DVD collection.  By showing filters in categories, you can allow users to search the way they want to, rather than in a prescribed category hierarchy of your choice.  We&#8217;ve recently used faceted search in a number of applications [...]]]></description>
			<content:encoded><![CDATA[<p>Faceted search is everywhere, making our online shopping experience easier, organising our photos, searching our DVD collection.  By showing filters in categories, you can allow users to search the way they want to, rather than in a prescribed category hierarchy of your choice.  We&#8217;ve recently used faceted search in a number of applications and found that one area is a particular challenge: choosing the suggested values under each category.</p>
<p><img src="http://assanka.net/content/tech/files/2011/01/moodys.png" alt="FT Tilt&#39;s faceting uses a blend of strategies" title="FT Tilt&#39;s faceting uses a blend of strategies" width="515" height="305" class="aligncenter size-full wp-image-217" /><br />
<cite>The site Assanka recently launched for <a href="http://tilt.ft.com">FT Tilt</a> uses a variety of faceting strategies to give the most appropriate suggestions in each category</cite></p>
<p>Most implementations of faceting always show the same suggestions in any given category, but may change the filters available based on the search the user has done so far.  For example, if you&#8217;re searching a computer supplies retailer for a new hard disk, most of the products matching your search will have a &#8216;capacity&#8217; property, so the retailer will probably offer you a capacity filter with all available capacities listed.  It probably won&#8217;t offer you a &#8216;resolution&#8217; filter, because resolution is not relevant to hard disks, and your existing search keywords have probably eliminated any product (like a monitor) for which a resolution choice would make sense.</p>
<p>So it&#8217;s pretty easy to choose which filters to display.  The more difficult problem is choosing which facets to display within each filter.</p>
<p>Sometimes it&#8217;s impractical to show all possible facets in a filter.  eBay, for example, allows you to restrict your search by seller, and of course there are millions of those.  So there are a number of possible strategies:</p>
<ol>
<li>Display every option available</li>
<li>Display just a few hard coded options, such as aggregate options designed to always match something (eg eBay&#8217;s &#8216;Top rated sellers&#8217; option), or editorially chosen &#8216;top picks&#8217;.  Either way, the point is that the suggestions are a subset of the full range available, and are not sensitive to the context created by the user&#8217;s search query</li>
<li>Use the list of options as per (1) or (2), but hide any that, if selected, would produce no results.  This refinement makes the suggestions context-sensitive in a very rudimentary way, in that they are at least reacting to the current state of the user&#8217;s search, but still do not surface any long-tail options when they become more relevant.</li>
<li>Generate options based on running the query the user has put together so far, including any facets selected, and analysing the resultset for the top facet refinements in each filter category.</li>
<li>As per (4), but for each filter category, run the search excluding any options already selected in that category.</li>
<li>As per (4) or (5), but where the values are all numeric, determine the facets not from the frequency of occurrence of specific values, but by analysing the distribution of values within the resultset and constructing boundaries that provide a sensible number of divisions with approximately the same number of results in each division.</li>
</ol>
<p>There are arguments for and against all of these, depending on the distribution of the metadata within your document index.</p>
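<p>To make strategy 4 in the list above more concrete, here is a minimal sketch that counts facet values across the current result set and returns the most frequent ones.  The shape of each result object and the <code>topFacetValues</code> name are assumptions for illustration; a real search engine would normally do this counting for you.</p>
<pre class="brush: jscript;">
// Sketch of strategy 4: suggest the most frequent facet values in the current results.
// Each result is assumed to look like { facets: { actor: ['A', 'B'], rating: ['PG'] } }.
function topFacetValues(results, category, limit) {
  var counts = Object.create(null);  // no inherited keys to collide with facet values
  results.forEach(function (result) {
    (result.facets[category] || []).forEach(function (value) {
      counts[value] = (counts[value] || 0) + 1;
    });
  });
  return Object.keys(counts)
    .sort(function (a, b) {
      // Most frequent first; ties are broken alphabetically for stable output
      return (counts[b] - counts[a]) || a.localeCompare(b);
    })
    .slice(0, limit)
    .map(function (value) { return { value: value, count: counts[value] }; });
}
</pre>
<p>Strategy 5 could reuse the same function, passing in the results of a search that leaves out any options already selected in the category being counted.</p>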
<p>An e-commerce site will typically have a set of properties on a product, where each property has one value, and won&#8217;t use all the available property names.  The values available for each property will also typically be a range set, and will efficiently cover the whole available range.  For example a hard disk will have a &#8216;capacity&#8217; property, where the options might include 200GB, 300GB, 500GB and 1TB.  Only one of these can apply to any given hard disk.  A hard disk, as previously discussed, will also not make use of other available properties, such as resolution, that might apply to other types of product, such as monitors.  A property like &#8216;pages per minute&#8217; is really only going to be used on a very small subset of your product catalogue (only printers).  &#8216;Capacity&#8217; might get a bit more limelight as it applies to USB sticks, RAM and storage appliances as well as hard disks (though consider that the option values required in these types of products might be in a different range), and some properties like &#8216;manufacturer&#8217; would apply to virtually the entire product catalogue.</p>
<p>Sometimes, a product might have more than one applicable value in the same category.  Take an example category &#8220;Special offers available&#8221;.  A single product might qualify for &#8220;Buy one get one free&#8221; as well as &#8220;Free delivery&#8221;.   This kind of thing actually happens more in the non-retail world, and a better example would be a film library, where a film may have more than one actor, more than one screenwriter, more than one content advisory.  Where this is the case, filtering on one actor does not necessarily mean you&#8217;ve excluded all the others from the resultset.</p>
<p>The distribution is also relevant.  In the &#8216;capacity&#8217; example, we could expand the number of values to include more granularity below 50GB to allow for solid state devices, and above 1TB to allow for storage appliances, but in any given search, results will tend to form an unequal distribution with a peak, or multiple peaks, in particular capacities.  Across a range of hard disks, at time of writing 100-500GB would likely be the most popular value.  On the other hand take a category like &#8216;Actor&#8217; in the case of a film library, and the distribution looks a lot flatter.  An actor can only do so many films, and there are a lot of actors, so there isn&#8217;t a strong head to this distribution &#8211; it&#8217;s all tail.</p>
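<p>For numeric categories with an uneven distribution like this, strategy 6 from the list above can be sketched as a simple equal-frequency split.  The <code>equalFrequencyRanges</code> name is hypothetical, and this version makes no attempt to round boundaries to friendly numbers or to keep identical values in the same range, both of which a real implementation would probably want.</p>
<pre class="brush: jscript;">
// Sketch of strategy 6: split numeric values into ranges holding roughly equal numbers of results.
function equalFrequencyRanges(values, divisions) {
  var remaining = values.slice().sort(function (a, b) { return a - b; });
  var size = Math.ceil(remaining.length / divisions);  // results per range
  var ranges = [];
  while (remaining.length !== 0) {
    var chunk = remaining.slice(0, size);
    remaining = remaining.slice(size);
    ranges.push({ min: chunk[0], max: chunk[chunk.length - 1], count: chunk.length });
  }
  return ranges;
}
</pre>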
<p>Looking at each filter category in turn, the logic for deciding how to choose suggested values therefore comes down to a number of questions about how the category is used to classify your content:</p>
<ol>
<li>Are there few enough values that you could display them all together?</li>
<li>Do the values form a continuous range (like capacity) or are they discrete options (like actor)?</li>
<li>Is it possible for any single item of content to have more than one value from the same category?</li>
<li>Is there a &#8216;head&#8217; of a few values (few enough to display all of them together) which, combined, apply to a majority of your content?</li>
</ol>
<p>It&#8217;s important to stress that these are not decisions to take for your site as a whole &#8211; they need to be applied to each filter category individually.</p>
<p>There is one final consideration.  Is your faceting feature intended to help users narrow their search, change it, or broaden it?</p>
<p>Going back to the possible strategies for determining facets, displaying every option available works for small categories, and using hard-coded option groups like &#8216;top 100 sellers&#8217; is basically a way of displaying every option available by consolidating many options into one.  Doing this where necessary (and then also hiding any options that would result in no matches) gives you about the best solution you are likely to get without going context-sensitive.</p>
<p><img src="http://assanka.net/content/tech/files/2011/01/Capture1.png" alt="Amazon BBFC ratings" title="Amazon BBFC ratings" width="210" height="68" class="aligncenter size-full wp-image-215" /><br />
<cite>The DVD search on <a href="http://www.amazon.co.uk">Amazon.co.uk</a> displays all possible options in the BBFC rating category, and hides any that don&#8217;t apply to the results.  In this case, the results found by the search only contain films rated PG, 15 and 12.</cite></p>
<p>It starts to get interesting when you have a lot of possible values, in a flattish distribution, and want to present specific, context sensitive options.  The sense of &#8216;context&#8217; depends on whether you want to help broaden or narrow the search.  </p>
<p>Take a search for the location &#8220;UK&#8221; and animal &#8220;Dog&#8221;, which will give you results referring to dogs in the UK.  One of the facet categories might be location, in which there is one option that is already a term in the search &#8211; UK.  Determining the facets to suggest in the location category by only looking at the results returned in the existing search would only yield locations that co-exist on items tagged with UK and dogs, so your filter refinements would be places like &#8220;Manchester&#8221;, &#8220;Birmingham&#8221;, &#8220;Liverpool&#8221;, &#8220;London&#8221;, &#8220;Cardiff&#8221;, &#8220;Edinburgh&#8221;.  This helps the searcher to refine their search.</p>
<p><img src="http://assanka.net/content/tech/files/2011/01/Capture.png" alt="Lovefilm&#39;s search facets offer refinement" title="Lovefilm&#39;s search facets offer refinement" width="423" height="171" class="aligncenter size-full wp-image-213" /><br />
<cite>The film search on <a href="http://www.lovefilm.com">Lovefilm</a> presents facets that narrow your search.  It also arranges the facets in a hierarchy, though this is an illusion.</cite></p>
<p>However, using strategy 5, you run the search once on &#8220;UK Dogs&#8221; to produce the results to show the user, but you run it again on &#8220;Dogs&#8221;, excluding the term from the locations category, to generate location facets.  This time you get locations globally that co-exist with Dogs, and the suggestions for facets in any given category do not change if you choose to add a facet from that category to your search.  In this case the suggestions would be more likely to be &#8220;Paris&#8221;, &#8220;New York&#8221;, &#8220;France&#8221;, &#8220;Boston&#8221;, &#8220;London&#8221;, &#8220;United States&#8221;.  This helps to broaden the search by showing the user good suggestions for other locations that also have lots of dogs that they may not have considered.</p>
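<p>The difference between the two approaches can be sketched in a few lines of JavaScript.  This is only an illustration over a tiny in-memory catalogue, not how a real search engine computes facets; the <code>items</code> data and the function names are invented for the example.</p>

```javascript
// Hypothetical in-memory catalogue: each item carries tags grouped by category.
const items = [
  { title: "Dog show", tags: { location: ["UK", "London"], animal: ["Dog"] } },
  { title: "Sled race", tags: { location: ["United States", "Boston"], animal: ["Dog"] } },
  { title: "Cat cafe", tags: { location: ["UK", "Manchester"], animal: ["Cat"] } },
  { title: "Kennel club", tags: { location: ["UK", "Manchester"], animal: ["Dog"] } },
];

// An item matches when it carries every selected value in every selected category.
function matches(item, query) {
  return Object.entries(query).every(([category, values]) =>
    values.every((value) => (item.tags[category] || []).includes(value))
  );
}

// Count how often each value of `category` occurs in `resultset`,
// skipping any values listed in `exclude`.
function facetCounts(resultset, category, exclude = []) {
  const skipped = new Set(exclude);
  const counts = {};
  for (const item of resultset) {
    for (const value of item.tags[category] || []) {
      if (!skipped.has(value)) counts[value] = (counts[value] || 0) + 1;
    }
  }
  return counts;
}

// Narrowing: facets come from the results the user is looking at.
// Values already in the query are skipped because every result carries them.
function narrowingFacets(allItems, query, category) {
  const results = allItems.filter((item) => matches(item, query));
  return facetCounts(results, category, query[category] || []);
}

// Broadening: run the search again without the terms from `category`,
// so the suggestions stay the same whichever value the user has picked.
function broadeningFacets(allItems, query, category) {
  const { [category]: ignored, ...rest } = query;
  const results = allItems.filter((item) => matches(item, rest));
  return facetCounts(results, category);
}
```

<p>With the query <code>{ location: ["UK"], animal: ["Dog"] }</code>, the narrowing version only suggests London and Manchester, while the broadening version also counts Boston and the United States.</p>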
<p>Finally, where the values are numeric, it may be appropriate to produce dynamic facets as range boundaries, based on an analysis of the values that exist within the resultset.  You still follow one or other of the above strategies to get a resultset that either narrows or broadens the search, but then analyse the list of values and construct divisions that evenly partition the data into a small number of ranges.  Doing this with a search for, say, the product type &#8220;Hard disks&#8221; and the capacity &#8220;100-150GB&#8221;, the suggested capacity facets would subdivide the selected range, with narrow boundaries covering the most popular capacities, and wider ranges where there are fewer results.</p>
<p><img src="http://assanka.net/content/tech/files/2011/01/facet1.png" alt="Ebuyer&#39;s faceting of numeric categories is inconsistent" title="Ebuyer&#39;s faceting of numeric categories is inconsistent" width="391" height="188" class="aligncenter size-full wp-image-211" /><br />
<cite><a href="http://www.ebuyer.com">Ebuyer&#8217;s</a> product search contains lots of faceting of numeric categories, some of which are presented in dynamic ranges, and some are treated as standard terms</cite></p>
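<p>One way to build such ranges is to sort the values and cut them into buckets that each hold roughly the same number of results.  The sketch below assumes that equal-count reading of &#8220;evenly partition&#8221;; <code>rangeFacets</code> is a made-up name and other partitioning schemes would be just as valid.</p>

```javascript
// Split numeric values into up to `bucketCount` ranges holding roughly equal
// numbers of results, so popular values get narrow ranges and sparse regions wide ones.
function rangeFacets(values, bucketCount) {
  const sorted = [...values].sort((a, b) => a - b);
  const facets = [];
  let start = 0;
  for (let bucket = 1; bucket <= bucketCount && start < sorted.length; bucket++) {
    // Target end index for this bucket, always taking at least one value.
    let end = Math.max(start + 1, Math.round((bucket * sorted.length) / bucketCount));
    // Never split identical values across two ranges.
    while (end < sorted.length && sorted[end] === sorted[end - 1]) end++;
    facets.push({ min: sorted[start], max: sorted[end - 1], count: end - start });
    start = end;
  }
  return facets;
}

// Capacities in GB for a hypothetical resultset.
const capacities = [100, 100, 100, 120, 120, 150, 250, 500, 1000];
const facets = rangeFacets(capacities, 3);
// Three ranges of three results each: 100-100, 120-150 and 250-1000.
```

<p>Note how the last range is far wider than the first two, because results are sparse at the top of the scale.</p>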
<p>For numeric categories like this, it&#8217;s also worth considering the sort order of the facet list.  For categories of non-numeric values and sometimes even for discrete numeric values (say &#8216;film speed&#8217; or &#8216;quantity per box&#8217;), it&#8217;s generally best to present them in decreasing order of popularity within the search context you&#8217;ve chosen.   For numeric categories where you&#8217;re constructing range facets (e.g. &#8216;capacity&#8217;, &#8216;pixel density&#8217;, &#8216;brightness&#8217;), they should instead be presented in value order.</p>
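<p>As a small illustration of that ordering rule, assuming each facet option is an object with a <code>count</code> and, for ranges, a <code>min</code> (both field names are invented for the example):</p>

```javascript
// Order facet options for display: range facets by value, everything else by popularity.
function sortFacets(facets, isRange) {
  const byValue = (a, b) => a.min - b.min;
  const byPopularity = (a, b) => b.count - a.count;
  // Copy first so the caller's array is left untouched.
  return [...facets].sort(isRange ? byValue : byPopularity);
}
```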
<h2>Conclusion</h2>
<p>There are many ways of faceting search results.  Take some time to choose the one that suits your application best, and provides the best search experience for your users.  We use the open source search engine <a href="http://www.xapian.org">Xapian</a>, which is excellent at doing all the faceting described in this post, and I&#8217;d like to publicly acknowledge <a href='http://cnav.co.uk/'>Richard Boulton</a> for his excellent advice when we were designing faceting strategy for FT Tilt.</p>
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2010/11/16/faceted-search-choosing-good-facet-suggestions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Behind the scenes of the new CIPR site</title>
		<link>http://assanka.net/content/tech/2010/06/06/behind-the-site-cipr/</link>
		<comments>http://assanka.net/content/tech/2010/06/06/behind-the-site-cipr/#comments</comments>
		<pubDate>Sun, 06 Jun 2010 11:09:13 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[cipr]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[walkthough]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=139</guid>
		<description><![CDATA[We present a detailed overview of the way we built the CIPR's new website using Drupal.  Find out what modules and techniques were used to create all the main user facing features of the site, and the technical architecture that is serving it.]]></description>
			<content:encoded><![CDATA[<p>When the <a href="http://www.cipr.co.uk">CIPR</a> asked us to redevelop their website, we jumped at the chance to bring open source technology and web standards to an organisation that represents and promotes the marketing communications industry.  With 9,500 members &#8211; all professional communicators &#8211; it was a daunting task to produce something that the membership would approve of.</p>
<p>We chose to use Drupal &#8211; having successfully built Drupal sites for the likes of News International, we knew we could hit the ground running and end up with something that would be able to serve the CIPR for several years.</p>
<h2>The challenges</h2>
<p>The existing site had 30,000 static HTML files, with widely varying markup and no style consistency.  Because the site was really difficult to edit, many of the most active areas had been spun out into microsites with an ASP.NET-based proprietary content management system.  These microsites were editable by their respective content owners, but with no oversight, and with complete content control available to the editors, so styles began to diverge even more.</p>
<p>A number of third party services and applications needed to be integrated into the new site: <a href="http://www.eventbrite.com">Eventbrite</a> for event ticketing, <a href="http://www.ezproxy.com">ezProxy</a> for access to academic journal services, and the CIPR&#8217;s own Ladder CPD system (another Assanka project).  For Ladder and ezProxy, we needed to be able to authenticate CIPR members outside of Drupal, so we would need some kind of single sign-on system centred on the Drupal site.</p>
<p>Details about the members themselves are held in the CIPR&#8217;s membership system, a client-server application with a Windows front end and a Microsoft SQL Server database.  On the old site, synchronising this with the database the website was using for authentication required a long manual process and, for some reason, a number of Microsoft Word mail merges. The membership system is here to stay, so our solution would need to vastly improve on this process to get member profile updates into the website far faster than was possible before.</p>
<p>Finally, it&#8217;s worth mentioning the workflow challenges. Under the old site, the institute&#8217;s staff had to email their requested changes to the web team using a Microsoft Word based form. The web team would implement them &#8220;within 5 working days&#8221;, by editing static HTML files manually, and could not provide any content oversight, only the technical support to put the content online. For a new site to be successful, the management and oversight of the institute&#8217;s web output would need to change completely.</p>
<h2>Build process</h2>
<p>Our build process followed a fairly typical sequence:</p>
<ol>
<li><strong>Site map workshops:</strong> We gathered together and worked with the main content creators and heads of department. Over a number of sessions, and using a lot of Post-it notes and a very large wall, we produced a site map.  The site map was transferred to Mindmeister, where it continued to be collaboratively edited and refined.</li>
<li><strong>Wireframing:</strong> We produced wireframes of a new home page and typical article pages, to explore the organisation and prioritisation of content.  We needed a number of different, flexible content positions to accommodate a variety of content and media.</li>
<li><strong>Creative concepts: </strong>The wireframes were used as a basis for producing creative concepts to establish a new stylish look and feel for the site.  Three concepts were presented, from which the CIPR chose one.</li>
<li><strong>Standardisation:</strong> From the concept mock-ups, we derived a set of design standards. This is where we sought out any inconsistencies in the design and attacked them to produce a documented design standard that would be used as a rule book for developing the remaining layouts. The standards cover column layouts, margins, font and colour palettes, image aspect ratios and sizes, paragraph formats, heading levels, link and button styles and all interaction styling.  It&#8217;s easy to forget interaction styles &#8211; for every button, for example, you need a standard, hovered, pressed and disabled style.</li>
<li><strong>Layout development:</strong> Using the design standards, the concepts were expanded into the dozen or so different layouts that would be used in the final site.</li>
<li><strong>Rendering:</strong> With all layouts complete, HTML templates and an optimised stylesheet could be produced to serve all the layouts.</li>
<li><strong>Drupal build: </strong>Skinning Drupal with the new templates, installing and configuring all the content types and modules we would need, and writing our own bespoke modules for our special requirements.</li>
<li><strong>UI behaviours and trackers: </strong>JavaScript libraries and loaders for all the interface behaviours, with graceful degradation for users with JavaScript disabled.</li>
</ol>
<h2>Special features</h2>
<p>Much of the functionality of the new site came out of the box with Drupal, but we added some of our own modules to enable some particular functionality:</p>
<h3>Node image</h3>
<p>Our design called for image based teasers for content pages.  You can see examples of these on the homepage of the site, under &#8216;Features&#8217;.  The image used for the feature teaser cannot be required to be part of the content of the page itself, so it has to be separately mapped to the node.  Additionally, our design standards mandated a specific aspect ratio for all images on the site, and four specific valid sizes, so all images, whether used for feature teasers or within the content of the node, must be made to fit these design rules.</p>
<p>To achieve this, we wrote a new module, which we call <strong>Node Image</strong>, and a <a href="http://tinymce.moxiecode.com/">TinyMCE</a> plugin called Node Image Chooser, to replace the standard TinyMCE Image insert/edit dialog.  The Node Image module also alters the node edit form to add the ability to attach images outside the content body (to serve as feature teasers).</p>
<p>The form for uploading and editing node image nodes provides a cropper tool to allow the user to preview the original uploaded image (often a digicam photo at a far higher resolution than desired), and crop the four valid image sizes out of it, using a draggable frame with a constrained aspect ratio.  This allows editors to quickly create the four variants of each image, so for any image we can always offer the choice of all four possible sizes, without requiring editors to be trained in image manipulation software.</p>
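<p>The geometry behind a constrained cropper is simple enough to sketch.  The code below is not the Node Image module itself, just an illustration in JavaScript; the 3:2 ratio is a placeholder, since the real ratio from our design standards isn&#8217;t stated here, and the function names are invented.</p>

```javascript
// Hypothetical design rule: every image is 3:2 (placeholder ratio for illustration).
const ASPECT_RATIO = 3 / 2;

// Largest crop frame with the required aspect ratio that fits inside the upload, centred.
function defaultCropFrame(imageWidth, imageHeight, ratio = ASPECT_RATIO) {
  let width = imageWidth;
  let height = Math.round(width / ratio);
  if (height > imageHeight) {
    // Image is too wide for the ratio, so the height becomes the limiting side.
    height = imageHeight;
    width = Math.round(height * ratio);
  }
  return {
    x: Math.floor((imageWidth - width) / 2),
    y: Math.floor((imageHeight - height) / 2),
    width,
    height,
  };
}

// When the user drags a corner, derive the height from the width so the ratio holds,
// and clamp the frame so it stays inside the image.
function resizeFrame(frame, newWidth, imageWidth, imageHeight, ratio = ASPECT_RATIO) {
  const maxWidth = Math.min(imageWidth - frame.x, (imageHeight - frame.y) * ratio);
  const width = Math.max(1, Math.min(newWidth, Math.floor(maxWidth)));
  return { ...frame, width, height: Math.round(width / ratio) };
}
```

<p>Because the frame always has the right proportions, each of the four valid sizes can then be produced by scaling the same cropped region.</p>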
<h3>Restricted content</h3>
<p>We needed to be able to protect some pages so that they were only accessible by CIPR members or even a smaller subset of the membership who were perhaps members of a particular regional or sectoral group.  However, access to this information is one of the most important and valuable benefits of membership, so we wanted to ensure that non-members could see that this content existed, even if they couldn&#8217;t access it themselves.</p>
<p>Many newspaper sites handle this situation with a concept called a content barrier.  The user is shown a teaser of the page content, and a message indicating that in order to read the remainder of the content, they must pay a fee or register.  We started with the existing <a href="http://drupal.org/project/restricted_content">Restricted Content</a> module, but completely rewrote it to support tokens and offer completely different views to anonymous users versus those who were logged in but simply did not have the required role.  We&#8217;ve released these changes back to the community &#8211; <a href="http://drupal.org/node/441404">our modified version is available on the Drupal project page</a>.</p>
<p>We married this module with a bespoke (and very simple) robots login module, which identifies legitimate search engine crawlers from Google, Yahoo and Microsoft, and allows them to view all protected content as if they were members, but with a NOARCHIVE restriction.  This further contributes towards our goal of making this member-only content visible to non-members by allowing them to find it via a general web search, though they still need to join in order to view the full page.</p>
<h3>Single sign on</h3>
<p>We didn&#8217;t just need to authenticate users for the benefit of Drupal.  We also needed them to be logged in for other sites as well, notably Ladder, the CIPR&#8217;s Continuing Professional Development system.  We wanted to ensure that users did not have to log in several times to access these services.  So we turned Drupal into a single sign-on hub.</p>
<p>Our solution starts with the <a href="http://drupal.org/project/logincookie">Login Cookie</a> module, modified to support tokens and shared secret based signing.  When users log in, we use this module to set a cookie containing various particulars about their user account, and a signature that can be verified by co-operating apps on the same root domain.</p>
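<p>To show the idea of shared secret signing, here is a minimal sketch of how a co-operating app written for Node.js might create and verify such a cookie using an HMAC.  The cookie layout, the secret and the function names are assumptions made for the example; they are not the format used by the Login Cookie module.</p>

```javascript
const crypto = require("crypto");

// Shared secret known to the hub site and the co-operating apps (placeholder value).
const SHARED_SECRET = "replace-with-a-long-random-secret";

// Build a cookie value of the form payload.signature, where the payload is
// base64url-encoded JSON and the signature is a hex HMAC-SHA256 of the payload.
function signCookie(account, secret = SHARED_SECRET) {
  const payload = Buffer.from(JSON.stringify(account)).toString("base64url");
  const signature = crypto.createHmac("sha256", secret).update(payload).digest("hex");
  return `${payload}.${signature}`;
}

// Return the account details if the signature checks out, otherwise null.
function verifyCookie(cookie, secret = SHARED_SECRET) {
  const [payload, signature] = String(cookie).split(".");
  if (!payload || !signature) return null;
  const expected = crypto.createHmac("sha256", secret).update(payload).digest("hex");
  const given = Buffer.from(signature, "utf8");
  const wanted = Buffer.from(expected, "utf8");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (given.length !== wanted.length || !crypto.timingSafeEqual(given, wanted)) return null;
  return JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
}
```

<p>An app that knows the secret can trust the account details without calling back to the hub, while a cookie that has been edited by the user fails verification.</p>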
<p>The second part of the solution is the need to redirect users to a querystring-specified destination on login and logout.  This is to enable users initiating login directly from an external app to be seamlessly redirected back to it after their login has been processed.  We started with the Login Destination module, but found this to be over-engineered for our needs in many ways, while it also missed a key requirement &#8211; redirect on logOUT.  So we ended up writing a bespoke module for this.</p>
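<p>Any redirect driven by the querystring needs to be checked, otherwise the login page becomes an open redirect.  The sketch below shows one way to do that check in JavaScript; it is not our bespoke module, and the <code>destination</code> parameter name and the <code>example.org</code> root domain are assumptions for illustration.</p>

```javascript
// Hypothetical root domain shared by the hub site and the co-operating apps.
const ROOT_DOMAIN = "example.org";
const DEFAULT_DESTINATION = "/";

// Pick the post-login (or post-logout) redirect target from the request URL.
// Only same-site paths and hosts under the root domain are accepted, so the
// login page cannot be abused as an open redirect.
function redirectTarget(requestUrl, rootDomain = ROOT_DOMAIN) {
  const base = `https://www.${rootDomain}`;
  const destination = new URL(requestUrl, base).searchParams.get("destination");
  if (!destination) return DEFAULT_DESTINATION;
  let target;
  try {
    // Relative destinations are resolved against the hub site itself.
    target = new URL(destination, base);
  } catch (error) {
    return DEFAULT_DESTINATION;
  }
  const host = target.hostname;
  const trusted = host === rootDomain || host.endsWith(`.${rootDomain}`);
  const safeScheme = target.protocol === "https:" || target.protocol === "http:";
  return trusted && safeScheme ? target.href : DEFAULT_DESTINATION;
}
```

<p>The same function serves both login and logout, which was the requirement the existing module did not cover.</p>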
<p>Finally, the <a href="http://drupal.org/project/services">services</a> module gave external apps that read the SSO cookie the ability to query the user&#8217;s full Drupal user account to get the details that we did not include in the cookie.</p>
<h3>Social sharing</h3>
<p>Our requirements included the usual need to allow social sharing of content via Digg, Facebook, Twitter and so on, as well as the ability to send pages by email to friends, and bookmark pages in the browser.  All these were easily solved with <a href="http://www.addthis.com">AddThis</a> which also adds analytics into the bargain.</p>
<h3>Inline audio and video</h3>
<p>The CIPR needs to be able to include videos from <a href="http://www.youtube.com">YouTube</a> and <a href="http://www.vimeo.com">Vimeo</a>, and publishes its own video content via <a href="http://www.vzaar.com">Vzaar</a>, all of which have straightforward embed codes, but there&#8217;s no simple way to insert these into TinyMCE in a way that allows the designer (not editorial users) to control their dimensions and style.</p>
<p>We wrote a new TinyMCE plug-in to replace the media plug-in, which allows CIPR editors to find an online video and just paste the URL into the plug-in dialog.  The plug-in recognises the URL format, queries the API of the appropriate video service provider, and displays the video title to the user to confirm they have the right one.  The user can then simply choose which of our standard image sizes they wish to use for the video player &#8211; ensuring that video players are shown at exactly the same size as embedded images.</p>
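<p>The URL recognition step might look something like the sketch below.  It is an illustration rather than the plug-in&#8217;s actual code: the patterns are assumptions based on the public link formats of YouTube and Vimeo, Vzaar is left out, and the follow-up API query for the video title is not shown.</p>

```javascript
// Recognise a pasted video URL and return the provider and video ID, or null.
// The URL patterns are assumptions based on the public link formats of each service.
function recogniseVideoUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch (error) {
    return null; // not a URL at all
  }
  const host = url.hostname.replace(/^www\./, "");
  const firstSegment = url.pathname.split("/").filter(Boolean)[0] || "";

  // Standard YouTube watch pages carry the ID in the "v" query parameter.
  if (host === "youtube.com" && url.pathname === "/watch" && url.searchParams.get("v")) {
    return { provider: "youtube", id: url.searchParams.get("v") };
  }
  // Short YouTube links carry the ID as the only path segment.
  if (host === "youtu.be" && firstSegment) {
    return { provider: "youtube", id: firstSegment };
  }
  // Vimeo video pages use a purely numeric path.
  if (host === "vimeo.com" && /^\d+$/.test(firstSegment)) {
    return { provider: "vimeo", id: firstSegment };
  }
  return null;
}
```

<p>Returning <code>null</code> for anything unrecognised lets the dialog refuse the URL rather than insert an embed it cannot size correctly.</p>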
<p>This continues a theme of replacing complex functionality with alternatives that separate content decisions from style at the admin level.  So if an editor wants to insert a video, they should not have the choice of what width and height to assign to the player.  That is a decision made by the designer, not the user.  So our plug-ins don&#8217;t offer the choice.  As a result all the CIPR&#8217;s videos are presented in the same way.</p>
<p>Audio files are embedded using the same plug-in, recognising the file as an audio format, and embedding the excellent <a href="http://www.google.com/reader">Google Reader</a> inline mp3 player.</p>
<h3>Forms</h3>
<p>The CIPR needs loads of forms.  Membership applications, renewals, applications to join groups, entries for awards competitions, and so on.  Drupal&#8217;s capabilities to create arbitrary user defined forms and aggregate the results did not really do the job, so we turned to the service provided by <a href="http://www.formassembly.com">Formassembly</a>.  This third party service allows a non-specialist end user to create a wide variety of forms, and process results in many different ways, as well as providing useful connectors such as <a href="http://www.salesforce.com">Salesforce</a> integration and one-off <a href="http://www.paypal.com">PayPal</a> payments.</p>
<p>We created a TinyMCE plug-in to insert verified Formassembly form IDs into a specially designed machine tag in the content of any page.  We paired this with an output filter that converted the machine tag into the full HTML of the form, making a few changes to it on the way through &#8211; tweaking Formassembly&#8217;s HTML mark-up slightly allowed it to be styled by our CSS without having to include a second set of form styles.</p>
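<p>The machine tag approach can be illustrated with a short JavaScript sketch.  The real filter runs inside Drupal and its tag syntax isn&#8217;t shown here, so the <code>[formassembly:ID]</code> format, the <code>renderForm</code> callback and the inline-style clean-up are all assumptions made for the example.</p>

```javascript
// Hypothetical machine tag syntax: [formassembly:12345]
const FORM_TAG = /\[formassembly:(\d+)\]/g;

// Example of a small mark-up tweak: drop inline style attributes so the
// site stylesheet alone controls how the form looks.
function tidyFormMarkup(markup) {
  return markup.replace(/\sstyle="[^"]*"/g, "");
}

// Replace every machine tag in `html` with the tidied markup returned by
// `renderForm(id)`, which stands in for whatever fetches the form HTML.
function expandFormTags(html, renderForm) {
  return html.replace(FORM_TAG, (tag, id) => tidyFormMarkup(renderForm(Number(id))));
}
```

<p>Storing only the tag in the page body means the form itself can change on the third party service without anyone re-editing the page.</p>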
<h3>Search</h3>
<p>Drupal does offer user facing search via various means, but we felt that we needed something really awesome to make up for the years of having no search on the CIPR site at all.  We wanted something that offered a great relevance engine to surface the best matching content.  In short, we wanted something like Google on a small scale.</p>
<p>Fortunately, for a fee, that&#8217;s exactly what we got.  The search engine on the CIPR&#8217;s new site is powered by <a href="http://www.google.com/sitesearch">Google Site Search</a>, a paid-for API-driven service that builds and uses a dedicated index on Google&#8217;s servers, but is delivered and branded entirely within your own site.</p>
<h3>Course finder and member directory: searchable views</h3>
<p>The old site already had a directory of members, and we wanted to maintain that, and add directories of PR suppliers and courses as well.  The <a href="http://drupal.org/project/better_exposed_filters">Better Exposed Filters</a> module came in handy here, providing an excellent checkbox list interface to options such as level, cost and type of course.</p>
<h3>Events calendar and Eventbrite integration</h3>
<p>With the CIPR&#8217;s events all managed on Eventbrite, we needed some close integration between the Eventbrite accounts and the CIPR site.  Using Eventbrite&#8217;s XML feeds, we perform a regularly scheduled import of event data and use it to populate Event nodes within Drupal.  Eventbrite already provides a lot of meta data &#8211; and we add a machine tag in the Eventbrite body text to link the events into courses within the CIPR course finder.</p>
<p>This also enables features such as listing all available dates for a course on the course page, and allowing users to navigate to alternative dates from an event page.</p>
<h3>Meta data</h3>
<p>The existence of content types like courses and events made us think that it would be really useful to have the ability to assign arbitrary key-value data to any node, and display it in a table alongside the content, in a similar way to Wikipedia&#8217;s boxes.  We used CCK fields to collect the meta data and a custom block to display it.</p>
<h2>Content types</h2>
<p>Our full set of Drupal content types is:</p>
<ul>
<li><strong>Standard page: </strong>A flexible content page</li>
<li><strong>Blog post:</strong> Automatically bylined and aggregated under an author or topic.</li>
<li><strong>Course: </strong>Anything that can loosely be described as a training course is created using the Course node type, which has additional CCK fields for properties such as length, level, type, cost and accreditation.  This then powers the course finder.</li>
<li><strong>Course provider:</strong> Used to record details of an institution that provides training courses.  A course is linked relationally to a course provider which allows course provider pages to display a list of their available courses.</li>
<li><strong>Event: </strong>A single event, normally an instance of a training course, imported from Eventbrite, linked to a course</li>
<li><strong>Forum topic: </strong>Using the forums module, forums provide discussion areas for groups</li>
<li><strong>Node Image: </strong>Our bespoke node type for constrained-size images</li>
</ul>
<h2>Optimisation</h2>
<p>Finally, we thought we would share a few tips on optimising delivery and measurement.</p>
<ul>
<li><strong>Third party JavaScript loading:</strong> We use JavaScript from a number of third parties, such as AddThis and Google.  We developed a simple scheduled task to download these libraries and cache them on our servers so that we could deliver all the JavaScript to power the site&#8217;s UI behaviours in as few downloads as possible.</li>
<li><strong>JavaScript grouping: </strong>We group all our JavaScript files into two downloads &#8211; one in the &lt;head&gt;, and one just before &lt;/body&gt;.  A script in our theme concatenates the script files, and also returns 304 Not Modified responses whenever it can.</li>
<li><strong>Tracking 404s:</strong> It&#8217;s really easy when using Drupal to make the mistake of including your tracking JavaScript on your 404 error page.  Not only does this incorrectly inflate your traffic figures, but it&#8217;s a lost opportunity to create a useful broken links report.  We use Google&#8217;s recommended method for tracking 404s, and then we filter them out of the website profile we use for reporting traffic numbers.</li>
<li><strong>Caching of Twitter / Facebook feeds:</strong> Updates posted to the CIPR&#8217;s Twitter and Facebook accounts are downloaded on a schedule and cached on our servers as fully rendered HTML to allow them to be included in our pages at the final stage of assembly.</li>
<li><strong>Feedburner:</strong> Using a simple user-agent and request-uri based redirect in an Apache rewrite rule, all end user requests for RSS feeds are diverted to <a href="http://www.feedburner.com">Feedburner</a>, which provides excellent analytics to let us know how many readers are consuming each of our RSS feeds, using which devices or clients.  It also reduces the load on our servers to have over 95% of our feed requests answered by Feedburner.</li>
</ul>
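<p>The JavaScript grouping tip above relies on answering conditional requests.  Our script lives in the Drupal theme, so the following is only a sketch of the principle in JavaScript for Node.js; using a content hash as an ETag is an assumption, and a Last-Modified check would work equally well.</p>

```javascript
const crypto = require("crypto");

// Concatenate script sources into one response and answer conditional requests.
// `files` is a list of { name, source } objects; `requestHeaders` uses lower-case keys.
function serveBundle(files, requestHeaders = {}) {
  // The lone semicolon between files guards against a file that lacks a trailing one.
  const body = files.map((file) => `/* ${file.name} */\n${file.source}`).join("\n;\n");
  // A content hash makes a strong ETag: it changes whenever any file changes.
  const etag = `"${crypto.createHash("sha1").update(body).digest("hex")}"`;
  const headers = { "ETag": etag, "Content-Type": "application/javascript" };

  // The browser already holds this exact bundle, so send no body.
  if (requestHeaders["if-none-match"] === etag) {
    return { status: 304, headers, body: "" };
  }
  return { status: 200, headers, body };
}
```

<p>A repeat visitor sends the ETag back in <code>If-None-Match</code> and receives an empty 304 response until one of the source files changes.</p>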
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2010/06/06/behind-the-site-cipr/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts on CSS transitions in Firefox</title>
		<link>http://assanka.net/content/tech/2010/05/20/thoughts-on-css-transitions-in-firefox/</link>
		<comments>http://assanka.net/content/tech/2010/05/20/thoughts-on-css-transitions-in-firefox/#comments</comments>
		<pubDate>Thu, 20 May 2010 14:36:05 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[Animation]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[Future Tech]]></category>
		<category><![CDATA[JavaScript]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=129</guid>
		<description><![CDATA[To produce the CSS3 ticker plugin for jQuery that I&#8217;ve been working on recently, I used CSS3 transitions.  These are the magic new properties in CSS that make a browser animate elements without using JavaScript.
The logic for implementing native animations in the browser is a no-brainer, particularly if you try and use a site [...]]]></description>
			<content:encoded><![CDATA[<p>To produce the <a href="/content/tech/2010/03/14/smooth-css3-ticker-jquery-plugin/">CSS3 ticker plugin for jQuery</a> that I&#8217;ve been working on recently, I used <a href="http://webkit.org/blog/138/css-animation/">CSS3 transitions</a>.  These are the magic new properties in CSS that make a browser animate elements without using JavaScript.</p>
<p>The logic for implementing native animations in the browser is a no-brainer, particularly if you try and use a site that implements scrolling tickers by incrementing an element&#8217;s <code>left</code> or <code>margin-left</code> properties every few milliseconds.   Watch your interactions with the rest of the page slow down to a sluggish crawl and realise that animating the positioning of DOM elements in JavaScript can sometimes be pretty slow.</p>
<p>With animation increasingly important to creating rich and usable user interfaces, the chance to start optimising the performance of animation in the browser via native implementation is a great step forward.  Unfortunately, this is currently only supported by Webkit based browsers.  In developing the ticker, I wanted to see if support in Firefox would be possible.</p>
<p>From version 1.9.3 of Gecko, the <code>-moz-transition</code> family of CSS properties has been added, so I tried grabbing the latest version of <a href='http://www.mozilla.org/projects/minefield/'>Minefield</a> (the pre-release codename for Firefox) and running the ticker plugin.</p>
<p>It doesn&#8217;t work.  Specifically, it doesn&#8217;t work for these reasons:</p>
<ol>
<li>It doesn&#8217;t seem to be possible to act on position properties.  <code>left</code>, <code>right</code>, <code>top</code> and <code>bottom</code> do not appear to animate.  It might be possible to work around this by switching to animating margins rather than absolute positioning, but hopefully it will be possible to animate these before the Firefox 3.7 release.</li>
<li>It doesn&#8217;t appear to fire the <code>mozTransitionEnd</code> event, which you&#8217;d expect from a full implementation of the standard.  As a result, my ticker can&#8217;t loop.</li>
<li>It doesn&#8217;t seem to be possible to change animation properties (such as the duration) from JavaScript.  I need this to disable animation while repositioning the ticker for the next loop.</li>
</ol>
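<p>For anyone experimenting with the same problems, here is a rough sketch of feature detection and of switching the duration off from script.  The function names are invented, and the event names in the table are my understanding of what each engine fires: as far as I know Gecko builds use the unprefixed <code>transitionend</code> rather than a <code>moz</code>-prefixed name, so treat that row as an assumption to verify.</p>

```javascript
// Candidate style property names with the matching "transition finished" event.
const TRANSITION_CANDIDATES = [
  { property: "transition", endEvent: "transitionend" },
  { property: "WebkitTransition", endEvent: "webkitTransitionEnd" },
  { property: "MozTransition", endEvent: "transitionend" },
];

// Return the first candidate whose property exists on the given style object,
// or null when transitions are not supported. In a browser, pass element.style.
function detectTransitionSupport(style) {
  return TRANSITION_CANDIDATES.find((candidate) => candidate.property in style) || null;
}

// Temporarily disable animation by zeroing the duration, run `reposition`,
// then restore the previous value. In a real browser a forced reflow
// (for example reading element.offsetWidth) is usually needed before restoring.
function withoutTransition(style, support, reposition) {
  const durationProperty = `${support.property}Duration`;
  const previous = style[durationProperty];
  style[durationProperty] = "0s";
  reposition();
  style[durationProperty] = previous;
}
```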
<p>This is clearly an early pre-release version, and it&#8217;s encouraging to see support starting to emerge from the Firefox camp.  I think I might have retired by the time this kind of feature hits Internet Explorer!</p>
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2010/05/20/thoughts-on-css-transitions-in-firefox/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Smooth CSS3 ticker jQuery plugin</title>
		<link>http://assanka.net/content/tech/2010/03/14/smooth-css3-ticker-jquery-plugin/</link>
		<comments>http://assanka.net/content/tech/2010/03/14/smooth-css3-ticker-jquery-plugin/#comments</comments>
		<pubDate>Sun, 14 Mar 2010 11:50:26 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[jQuery]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tickers]]></category>
		<category><![CDATA[UI Design]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=105</guid>
		<description><![CDATA[I&#8217;ve recently been building a status display for the Assanka office, which shows us information like current Zabbix monitoring triggers, support requests from clients, Pingdom status alerts, and even naming and shaming Assankans who&#8217;ve failed to file their time-sheets on time.
In the process, I was pointed at the gorgeous Panic status board.  Our objectives [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently been building a status display for the Assanka office, which shows us information like current Zabbix monitoring triggers, support requests from clients, Pingdom status alerts, and even naming and shaming Assankans who&#8217;ve failed to file their time-sheets on time.</p>
<p>In the process, I was pointed at the gorgeous <a href="http://www.panic.com/blog/2010/03/the-panic-status-board/">Panic status board</a>.  Our objectives were similar and I was attracted to the idea of adding the scrolling tickers (at least, we assume they are scrolling from Panic&#8217;s photograph), which run along the bottom of their board.  I was reminded of the kind of tickers you get on 24 hour news channels (here&#8217;s one from BBC World):</p>
<p><img src="/content/tech/files/2010/03/bbcworldcap42.jpg" alt="BBC World screen capture showing scrolling ticker" /></p>
<p>Tickers in web browsers are difficult for a number of reasons.  First, continuously animating the <code>left</code> or <code>margin-left</code> properties of an element in JavaScript is computationally expensive, particularly if you want to reposition the element often enough to create a properly smooth animation.  Sites that use JavaScript for tickers in this manner tend to hog your computer&#8217;s CPU and feel slow and laggy as a result.  Many sites give up on the traditional ticker and go for something that&#8217;s easier for the browser to animate, like a &#8216;type in&#8217; reveal (e.g. the BBC News homepage ticker), or a vertical &#8216;scroll up&#8217; ticker.</p>
<p><img src="/content/tech/files/2010/05/Capture.jpg" alt="BBC News online ticker" /></p>
<p>Another issue with continuous tickers is making them loop properly.  When your animation reaches the end of the text, you have to scroll it all the way off the left edge of the screen before it can be repositioned and re-enter the frame from the right.  This leaves an ugly gap in your ticker.</p>
<p>Finally, if your ticker is updating with new information in real time, making changes that are noticeable in the visible portion of the ticker is amateur hour stuff.  You rarely see TV tickers suddenly change mid-scroll, and yet stories are seamlessly added and removed as necessary, <em>outside of the visible frame</em>.</p>
<p>Since Panic didn&#8217;t release any code (boo), I set out to produce a jQuery plugin that would enable news tickers to be easily added to a page, using CSS3 animation effects.  If it&#8217;s working for you, you should see a couple of tickers scrolling away right here:</p>
<div class="iframe-wrapper">
  <iframe src="/content/tech/files/2010/05/webkitscroller.html" frameborder="0" style="height:85px;width:590px;">Please upgrade your browser</iframe>
</div>
<p>If it&#8217;s not working, you&#8217;re probably not using a browser that supports CSS3 transitions.  Try it in Chrome, or Safari.  Support should be added to Firefox pretty soon.  But the objective here is not to get universal compatibility, since I can easily dictate which browser our status board will use.</p>
<p>With CSS3 transitions, the performance problem is substantially reduced, and it should shrink further as browsers implement these effects at a lower level.  This leaves the looping and updating issues to resolve before we can claim a properly news-grade ticker.</p>
<h2>Looping</h2>
<p>There are two possible solutions to the looping problem.  The first is to target segments of the ticker just after they have left the frame on the left, and move them to the end of the ticker on the right.  The second is to make several copies of your ticker content, and when you get to the end of the second copy, reposition the entire ticker element so that the first copy lines up with where the second copy had got to, then scroll on into the second copy again and repeat.</p>
<p>I chose the second approach, because you need several copies of the ticker anyway if it&#8217;s too short to fill the width of the frame with enough left over to sort the looping out.</p>
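<p>As a rough illustration of why several copies are needed, the arithmetic can be sketched as follows.  The function names and the exact formula are assumptions made for this example rather than anything taken from the plugin source, and the plugin itself may well create more copies than this minimum.</p>

```javascript
// Hypothetical helpers; names and formulas are illustrative assumptions,
// not taken from the plugin source.

// Smallest number of copies that keeps the frame covered while the strip
// scrolls by one full copy width before being repositioned.
function copiesNeeded(frameWidthPx, contentWidthPx) {
  return Math.ceil(frameWidthPx / contentWidthPx) + 1;
}

// Time a constant-speed transition needs to cover a given distance.
function scrollDurationSeconds(distancePx, pxPerSec) {
  return distancePx / pxPerSec;
}
```

<p>With a 590px frame and 200px of ticker text, <code>copiesNeeded(590, 200)</code> comes to four copies: three to span the frame, plus one left over so the strip can be repositioned without showing a gap.</p>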
<h2>Updates</h2>
<p>In order to enable the ticker to be updated dynamically, and yet not suffer any noticeable changes in the visible portion, I added some public methods &#8211; <code>addMsg()</code> and <code>removeMsg()</code>, which allow you to pass a list element and have it queued for addition or removal at an appropriate moment.</p>
<p>This does require breaking a rather holy tenet of jQuery, namely returning a reference to the ticker rather than returning the jQuery chain, so you have an object on which you can call the public methods.</p>
<p>Every time the ticker hits the end of the text, it is  repositioned with the penultimate copy in view, all copies of the ticker text are updated to match the copy to their right, and the final copy (the original set of elements) is updated with the queued additions and removals (out of view).   As a result, when you call the add or remove methods, you may have to wait some time before you see the effect, but the ticker will never judder or change while it is visible to the user.</p>
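<p>One way to picture this update cycle is with a small data-only model, in which each copy of the ticker is just an array of message strings.  This is a simplified sketch under that assumption: <code>createTickerModel</code>, <code>queueAdd</code>, <code>queueRemove</code>, <code>onWrap</code> and <code>snapshot</code> are hypothetical names, and the real plugin works on DOM elements and message identifiers rather than plain strings.</p>

```javascript
// Simplified model of the update scheme: each copy is an array of message
// strings, and the last entry in copies is the original (rightmost) set.
// All names here are hypothetical, chosen for the example.
function createTickerModel(messages, copyCount) {
  var copies = [];
  while (copyCount > copies.length) {
    copies.push(messages.slice());
  }
  var queue = [];

  return {
    // Queue a change; nothing visible is touched yet.
    queueAdd: function (text) { queue.push({ op: 'add', text: text }); },
    queueRemove: function (text) { queue.push({ op: 'remove', text: text }); },

    // Called when the scroll reaches the end of the text.
    onWrap: function () {
      // Each copy takes on the content of the copy to its right.
      copies.forEach(function (copy, i) {
        if (i !== copies.length - 1) {
          copies[i] = copies[i + 1].slice();
        }
      });
      // Queued changes are applied to the final copy only, out of view.
      var last = copies[copies.length - 1];
      queue.forEach(function (change) {
        var index = last.indexOf(change.text);
        if (change.op === 'add') {
          last.push(change.text);
        } else if (index !== -1) {
          last.splice(index, 1);
        }
      });
      queue = [];
    },

    // Read-only view of every copy, for inspection.
    snapshot: function () {
      return copies.map(function (copy) { return copy.join(' | '); });
    }
  };
}
```

<p>In this model a queued change reaches the rightmost copy on the next wrap and then moves one copy to the left on each wrap after that, which is why there can be a delay before it scrolls into view.</p>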
<h2>How to use</h2>
<p>Your ticker must be a block element containing a UL.  This is because the ticker needs a frame to mask the content as it slides through.</p>
<pre class="brush: xml;">
&lt;div id='ticker1'&gt;
  &lt;ul&gt;
    &lt;li&gt;Headline one&lt;/li&gt;
    &lt;li&gt;Headline two&lt;/li&gt;
  &lt;/ul&gt;
&lt;/div&gt;
&lt;script type=&quot;text/javascript&quot;&gt;
  var ticker = $('#ticker1').ticker();
  var newid = ticker.addMsg('Headline three');
  setTimeout(function() {
    ticker.removeMsg(newid);
  }, 60000);
&lt;/script&gt;
</pre>
<h2>Settings and methods</h2>
<p>The following settings are available, which can be passed to the <code>ticker()</code> function as a plain object literal:</p>
<ul>
<li><strong>pxpersec</strong>: Speed of the ticker in pixels per second (optional; default is 30)</li>
</ul>
<p>Calls to the <code>ticker()</code> function return a reference to the ticker, not the jQuery chain.  The ticker reference exposes the following methods:</p>
<ul>
<li><strong>string addMsg(<em>msg</em>)</strong>: Add a new message to the ticker.  Can be a jQuery object or raw list item element (which will be moved from its current location in the DOM), or a plain text string.  Returns a string identifier for the message (to enable you to remove it later).  If you pass an element that already has an ID attribute, that ID is used (and returned).</li>
<li><strong>void removeMsg(<em>msg</em>)</strong>: Remove a message from the ticker.  Can be a jQuery object or raw list item element (which must be in the ticker already), or a string identifier returned from the addMsg call that added the message to the ticker. Returns nothing.</li>
</ul>
<h2>Get the code</h2>
<ul>
<li><a href="/content/tech/files/2010/05/jquery.ticker.js">Download development</a> (commented, 7KB)</li>
<li><a href="/content/tech/files/2010/05/jquery.ticker.min.js">Download production</a> (minified, 4KB)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2010/03/14/smooth-css3-ticker-jquery-plugin/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Blocking events with blocker lists</title>
		<link>http://assanka.net/content/tech/2009/12/07/blocking-events-with-blocker-lists/</link>
		<comments>http://assanka.net/content/tech/2009/12/07/blocking-events-with-blocker-lists/#comments</comments>
		<pubDate>Mon, 07 Dec 2009 14:42:35 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Scrolling]]></category>
		<category><![CDATA[UI]]></category>
		<category><![CDATA[User experience]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=56</guid>
		<description><![CDATA[It&#8217;s often useful to be able to detect scroll events using the onscroll event handler in JavaScript.  For example, every time a user scrolls to nearly the bottom of the page, you load more content to create an &#8216;endless&#8217; page.  In my case, I have two DIVs set to overflow: auto, with chat [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s often useful to be able to detect scroll events using the onscroll event handler in JavaScript.  For example, every time a user scrolls to nearly the bottom of the page, you load more content to create an &#8216;endless&#8217; page.  In my case, I have two DIVs set to overflow: auto, with chat histories in each, where the chats have been taking place simultaneously.  I want to detect when the user scrolls one of the DIVs (either one) and then scroll the other one to keep the two in sync.  That is to say, we want messages that are currently vertically in the middle of DIV 1 to have been posted at around the same time as the messages at the same vertical viewport offset in DIV 2.</p>
<p>Try scrolling either of the panels in this example:</p>
<div class="iframe-wrapper">
  <iframe src="/content/tech/files/2009/12/testcase_blockerlist11.html" frameborder="0" style="height:220px;width:450px;">Please upgrade your browser</iframe>
</div>
<p>You should find it scrolls madly and unpredictably until you press stop.</p>
<p>The first problem is that onscroll is not simply fired when you finish scrolling.  It fires like a machine gun continuously while the mouse cursor is moving on the scroll handle.  The solution to this is a watchdog.  A watchdog is a timer that will execute action A if action B does not happen within X seconds.  So every time onscroll fires, we reset the timer, and if it gets to zero we know that the user has finished scrolling (or at least has paused for long enough for us to do something about it).</p>
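<p>A minimal sketch of such a watchdog is shown here.  The names <code>createWatchdog</code>, <code>delayMs</code> and <code>onQuiet</code> are made up for the example, and the optional <code>timers</code> argument is an assumption added purely so the logic can be exercised without real timers; the demos on this page may be written differently.</p>

```javascript
// Watchdog sketch: onQuiet runs only if reset() has not been called again
// within delayMs. The optional timers argument is an assumption added so the
// logic can be driven by fake timers in a test.
function createWatchdog(delayMs, onQuiet, timers) {
  var t = timers || {
    setTimeout: function (fn, ms) { return setTimeout(fn, ms); },
    clearTimeout: function (handle) { clearTimeout(handle); }
  };
  var handle = null;

  return {
    // Call this on every scroll event: it restarts the countdown.
    reset: function () {
      if (handle !== null) {
        t.clearTimeout(handle);
      }
      handle = t.setTimeout(function () {
        handle = null;
        onQuiet();
      }, delayMs);
    }
  };
}
```

<p>It could be wired up with something like <code>panel.onscroll = createWatchdog(250, syncPanels).reset;</code>, where <code>panel</code> and <code>syncPanels</code> are placeholders for your own element and handler.</p>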
<div class="iframe-wrapper">
  <iframe src="/content/tech/files/2009/12/testcase_blockerlist2.html" frameborder="0" style="height:220px;width:450px;">Please upgrade your browser</iframe>
</div>
<p>You&#8217;ll find that although it&#8217;s not quite as crazed, the panels do keep scrolling, one after the other.</p>
<p>The problem now is that when you scroll DIV 1, and this causes an automatic scroll of DIV 2, that automatic scroll ALSO triggers the onscroll event and so we then scroll DIV 1 again.  The solution is to have a variable that flags whether we are currently paying attention to scroll events, and turn it off when we don&#8217;t want to detect scrolling (ie just before the auto-scroll) and then on again after the scroll has completed (I&#8217;m using an animation library so the scroll takes around half a second to complete).  Try this (don&#8217;t click the button yet, just scroll the panels):</p>
<div class="iframe-wrapper">
  <iframe src="/content/tech/files/2009/12/testcase_blockerlist3.html" frameborder="0" style="height:220px;width:550px;">Please upgrade your browser</iframe>
</div>
<p>Success.  However, this approach is a bit short-sighted.</p>
<p>You also need to turn off scroll detection when adding or removing content from either DIV, because changing the height of the content within the element can also fire the onscroll event.  So if this is a chat window, and we&#8217;re adding and removing content in the DIVs, we have to disable scroll detection while we&#8217;re doing that and then turn it on again afterwards.</p>
<p>Sadly, our scroll detection flag is crude &#8211; it&#8217;s a simple yes/no.  Suppose it&#8217;s already set to no (say in order to execute a lengthy scrolling animation), and in the meantime we need to do something quick (say adding a line to one of the DIVs).  That quick operation is going to disable scroll detection (unnecessarily, as it&#8217;s already disabled) and then, crucially, enable it again before the scroll animation has finished.  Click the button in the example above to demonstrate this (and then scroll).  You should, intermittently, get the unwanted cascade effect you saw in example 2.</p>
<p>What we need is a list, not a flag.  Enter the blocker list &#8211; an object that collates tokens from each procedure in the script that is currently blocking scroll detection.  So if a procedure wants to disable scroll detection for a period of time, it adds its token to the blocker list, and then removes its token when it&#8217;s done.  When a scroll event fires, we now only need to know whether there are any items in the blocker list, and then we can work out if it&#8217;s ok to process the event.</p>
<p>It&#8217;s also worth noting that there&#8217;s no need to actually count the number of items in the blocker list &#8211; it&#8217;s enough simply to know that there is at least one item in there.  And as a further optimisation, we don&#8217;t actually need to compute this when scroll events fire (frequently), because the value will only change when some function wants to add or remove its token from the list (less frequent).  So we can have enableScrollDetection and disableScrollDetection functions that deal with the blocker list, and which ultimately simply change the old scrolldetection flag to true or false.</p>
<div class="iframe-wrapper">
  <iframe src="/content/tech/files/2009/12/testcase_blockerlist4.html" frameborder="0" style="height:220px;width:550px;">Please upgrade your browser</iframe>
</div>
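<p>For reference, the blocker list idea can be sketched in a few lines.  This is an illustration rather than the demo&#8217;s actual source: the <code>createBlockerList</code> wrapper, the <code>token</code> arguments and <code>isEnabled</code> are assumptions made for the example.</p>

```javascript
// Blocker list sketch. Each procedure that wants scroll events ignored
// registers a token; detection is enabled only when no tokens remain.
function createBlockerList() {
  var blockers = {};   // token -> true for every procedure currently blocking
  var enabled = true;  // cached flag, so the scroll handler does no list work

  // Recompute the flag: we only need to know whether at least one token exists.
  function refresh() {
    enabled = true;
    for (var token in blockers) {
      if (Object.prototype.hasOwnProperty.call(blockers, token)) {
        enabled = false;
        break;
      }
    }
  }

  return {
    disableScrollDetection: function (token) {
      blockers[token] = true;
      enabled = false;
    },
    enableScrollDetection: function (token) {
      delete blockers[token];
      refresh();
    },
    // The scroll handler checks this before doing anything.
    isEnabled: function () { return enabled; }
  };
}
```

<p>With this in place, a quick operation that adds and then removes its own token can no longer re-enable detection while a longer animation still holds a token of its own.</p>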
<p>There might be a neater way of achieving this, but this certainly works for me.  I&#8217;d welcome any comments or suggestions.  The sources for all these demos are available in the iframes above.</p>
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2009/12/07/blocking-events-with-blocker-lists/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Disappearing text cursor in Firefox</title>
		<link>http://assanka.net/content/tech/2009/11/25/disappearing-text-cursor-in-firefox/</link>
		<comments>http://assanka.net/content/tech/2009/11/25/disappearing-text-cursor-in-firefox/#comments</comments>
		<pubDate>Wed, 25 Nov 2009 13:03:15 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[Firefox]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[UI]]></category>
		<category><![CDATA[User experience]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=43</guid>
		<description><![CDATA[Do you ever find that sometimes when you try and type into a textbox, there is no cursor there, but it still accepts your input and the text appears as if there is a cursor?  I came across this problem in Firefox 3 and searching online revealed only solutions to an earlier problem that [...]]]></description>
			<content:encoded><![CDATA[<p>Do you ever find that sometimes when you try and type into a textbox, there is no cursor there, but it still accepts your input and the text appears as if there is a cursor?  I came across this problem in Firefox 3 and searching online revealed only <a href="http://www.nestedelements.com/2008/02/26/firefoxs-disappearing-cursor/">solutions</a> <a href="http://www.webdeveloper.com/forum/showthread.php?t=150640">to</a> <a href="http://blog.tremend.ro/2007/01/22/mouse-cursor-disappears-in-firefox/">an</a> <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=167801">earlier</a> <a href="http://www.fleegix.org/articles/2007-02-14-mystery-of-the-disappearing-cursor-caret">problem</a> that affected Firefox 2, in which the overflow configuration of the INPUT&#8217;s containing element would affect whether the cursor appeared in the input or not.</p>
<p>But this was clearly not my problem, and various sources suggested that this had in any case been fixed for Firefox 3.  I discovered that my problem was rather simpler.  If you disable an element that has focus, and then re-enable it, you don&#8217;t get the cursor back.</p>
<p>Solution: blur (remove focus from) the element before you disable it.  Those sources that do refer to this specific problem tend to suggest that you focus a different element before you disable the text box.  But this is not necessary &#8211; you can just blur the element that has focus and leave the page with nothing focused.</p>
<h2>Problem test case</h2>
<p>Try clicking and typing in the field below &#8211; if your browser exhibits this problem, the text cursor will not appear.  If it does, then your browser does not have this problem.</p>
<div class="iframe-wrapper">
  <iframe src="/content/tech/files/2009/12/testcase_ffcursor1.html" frameborder="0" style="height:35px;width:300px;">Please upgrade your browser</iframe>
</div>
<p>This demo has a single text field which, when you click into it, is briefly disabled, and then enabled again.  The disabling of the field while it has focus triggers this problem, so the text cursor does not appear (but you can still type into the field!).</p>
<h2>Solution test case</h2>
<p>Now try typing in this field &#8211; the cursor should appear consistently.</p>
<div class="iframe-wrapper">
  <iframe src="/content/tech/files/2009/12/testcase_ffcursor2.html" frameborder="0" style="height:35px;width:300px;">Please upgrade your browser</iframe>
</div>
<p>This demo still briefly disables and re-enables the text field when it gets focus, but this time it blurs it right before it&#8217;s disabled, then focuses it again after re-enabling it.  The result is that when you place your mouse cursor in the field to manually give it focus, the text cursor appears as normal.</p>
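<p>The sequence described above can be sketched as a pair of helpers.  The function names are hypothetical, and <code>field</code> is assumed to be a text input element, or anything else exposing <code>blur()</code>, <code>focus()</code> and a <code>disabled</code> property.</p>

```javascript
// field is assumed to be a text input element (or anything exposing
// blur(), focus() and a disabled property). Function names are hypothetical.
function disableField(field) {
  field.blur();            // remove focus first, then disable
  field.disabled = true;
}

function enableField(field, refocus) {
  field.disabled = false;
  if (refocus) {
    field.focus();         // optional: hand focus back, as the demo does
  }
}
```

<p>Passing <code>true</code> as the second argument to <code>enableField</code> mirrors the demo; leaving it out simply leaves the page with nothing focused.</p>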
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2009/11/25/disappearing-text-cursor-in-firefox/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Validating HTML input in PHP</title>
		<link>http://assanka.net/content/tech/2009/10/10/validating-html-input-in-php/</link>
		<comments>http://assanka.net/content/tech/2009/10/10/validating-html-input-in-php/#comments</comments>
		<pubDate>Sat, 10 Oct 2009 11:51:41 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[DTDs]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[User input]]></category>
		<category><![CDATA[Validation]]></category>
		<category><![CDATA[XHTML]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=38</guid>
		<description><![CDATA[It&#8217;s often the case that as web developers, we need to &#8216;clean&#8217; input from end users to ensure it does not contain any nasty formatting or script that we don&#8217;t want to allow on our sites.  Forums in particular often suffer from either security holes that allow cross site scripting attacks (XSS) or are [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s often the case that as web developers, we need to &#8216;clean&#8217; input from end users to ensure it does not contain any nasty formatting or script that we don&#8217;t want to allow on our sites.  Forums in particular often suffer from either security holes that allow cross site scripting attacks (XSS) or are so restrictive in what they allow to be input that it causes a nuisance to the user (for example, disallowing all HTML but allowing BBCode instead).</p>
<p>This problem is often solved with complex classes or functions in PHP that are designed to strip out the nasty stuff while allowing as much useful formatting as possible.  We realised that these functions are pretty much just reinventing the wheel, because there is already a pretty good mechanism for parsing and validating XML syntax: libxml, which has PHP bindings and can be accessed using SimpleXML.</p>
<p>What&#8217;s more, libxml can validate an XML document against a DTD, so if you reference an XHTML Transitional DTD in a DOCTYPE declaration at the start of your XML string, you can check that the markup is valid XHTML.</p>
<p>Here&#8217;s the PHP to do this.  This is tested on PHP 5.3 with libxml2-2.6.26-2.1.2.8.</p>
<pre class="brush: php;">
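function isXML($str) {
	// Collect libxml errors internally instead of emitting PHP warnings
	libxml_use_internal_errors(true);
	libxml_clear_errors();
	// Only load and validate against a DTD if the string declares one
	$options = (strpos($str, '&lt;!DOCTYPE') !== false) ? (LIBXML_DTDLOAD | LIBXML_DTDVALID) : 0;
	simplexml_load_string($str, 'SimpleXMLElement', $options);
	$errors = libxml_get_errors();
	// Valid unless any reported problem is more severe than a warning
	foreach ($errors as $error) {
		if ($error-&gt;level != LIBXML_ERR_WARNING) return false;
	}
	return true;
}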
</pre>
<p>You could of course use the contents of <code>$errors</code> to feed back to the user, or potentially deal with a validation failure more intelligently, but for now true or false will do.</p>
<p>So the markup submitted by a user is valid.  Excellent.  But just because the markup is valid doesn&#8217;t mean it&#8217;s safe to output to the browser.  You&#8217;ll also want to ensure it contains no <code>&lt;script type="text/javascript"&gt;</code> sections or event handlers, and may want to restrict the set of elements available.  This is where you can start getting creative with your own DTD spec.  Just start with the standard you want to conform to for the whole page (say XHTML) and strip out anything you don&#8217;t like.</p>
<p>We&#8217;ll start by removing the HEAD tag and all its contents.  Our users will not be writing entire documents, just fragments of body markup, so we don&#8217;t want a HEAD, TITLE, or any META tags, etc.</p>
<p>You can continue, removing things like SCRIPT, OBJECT, forms, frames, and so on.  Be careful where elements are defined using presets, which often contain the nasties, for example the <code>%event</code> set of attributes grants an element the ability to fire event handlers.  Fortunately this is almost exclusively used as part of <code>%attrs</code>, so we can just remove it from that superset.</p>
<p>We&#8217;ll also define a new root element <code>fragment_under_test</code> to ensure that we don&#8217;t cause any confusion and lead anyone to believe that they&#8217;re writing a normal <code>&lt;html&gt;</code> or <code>&lt;body&gt;</code>.</p>
<p>Once we&#8217;re done, we can then wrap the isXML function in a convenience function that adds our new custom DTD.</p>
<pre class="brush: php;">
function isXHTMLFragment($str) {
	return isXML(&quot;&lt;!DOCTYPE fragment_under_test SYSTEM \&quot;http://www.example.com/dtds/xhtml-content-restrictive.dtd\&quot;&gt;&lt;fragment_under_test&gt;&quot;.$str.&quot;&lt;/fragment_under_test&gt;&quot;);
}
</pre>
<p>If you want, feel free to <a href="/content/tech/files/2009/12/dtdfiles.zip">download the DTD I created for this article</a>.</p>
<p>Now you can use the fast libxml to validate user input in a fairly bulletproof way.</p>
<p>Finally, and very importantly, make sure you cache the schemas on your server in an XML catalog file.  If you don&#8217;t do this, libxml will make an external HTTP request for the DTD schema file every time you call the function.  In fact, since most web documents cite W3C DTDs, they are having enormous problems with software making repeated requests for the standard XHTML, HTML 4 etc DTDs which haven&#8217;t changed in years.  Be a good net citizen, and cache your schemas.  In this case we&#8217;re writing and hosting our own anyway, but if you&#8217;re using a public schema you may as well save yourself the pointless HTTP traffic, and it&#8217;ll speed up the validation as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2009/10/10/validating-html-input-in-php/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>JSON2.js vs Prototype</title>
		<link>http://assanka.net/content/tech/2009/09/02/json2-js-vs-prototype/</link>
		<comments>http://assanka.net/content/tech/2009/09/02/json2-js-vs-prototype/#comments</comments>
		<pubDate>Wed, 02 Sep 2009 16:24:23 +0000</pubDate>
		<dc:creator>Andrew Betts</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[JSON]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Prototype]]></category>

		<guid isPermaLink="false">http://assanka.net/content/tech/?p=5</guid>
		<description><![CDATA[We use Douglas Crockford&#8217;s json2.js frequently in our web apps.  Its stringify method allows JavaScript data structures to be trivially serialised before submission via AJAX to a web service.   It works by descending through the structure, calling the toJSON() method on anything it finds.  It also creates toJSON methods for data [...]]]></description>
			<content:encoded><![CDATA[<p>We use <a title="Douglas Crockford" rel="homepage" href="http://crockford.com/">Douglas Crockford</a>&#8217;s json2.js frequently in our web apps.  Its stringify method allows JavaScript data structures to be trivially serialised before submission via AJAX to a web service.   It works by descending through the structure, calling the <code>toJSON()</code> method on anything it finds.  It also creates <code>toJSON</code> methods for data types that do not already have them, on the basis that future browsers will introduce <code>toJSON()</code> support &#8211; at which point the native implementation can be used because it&#8217;s likely to be a lot faster.</p>
<p>Recently I needed to use this method to serialise some data in a JavaScript library that might be used in &#8216;foreign&#8217; web pages.  My own library was nicely encapsulated, and didn&#8217;t interfere with any other JavaScript that might be running on the page, and it included Douglas Crockford&#8217;s JSON2.js implementation.</p>
<p>But on one of our clients&#8217; sites, it didn&#8217;t work.  I got this:</p>
<pre class="brush: jscript;">
{&quot;key&quot;:&quot;val&quot;,[\{\&quot;key\&quot;:\&quot;val\&quot;\},\{\&quot;key\&quot;:\&quot;val\&quot;\}]}
</pre>
<p>What&#8217;s happened here is that any arrays in my data structure have been stringified twice.  This didn&#8217;t happen in my dev environment.  I narrowed down the differences and realised what was causing this effect.  They&#8217;re using <a href="http://prototypejs.org">Prototype</a>.  We&#8217;re not.</p>
<p>Prototype modifies a number of JavaScript&#8217;s native objects, including the Array object, and&#8230; you guessed it, adds a <code>toJSON()</code> method to it. Unfortunately it does not return what Crockford&#8217;s <a title="JSON" rel="homepage" href="http://json.org/">JSON</a> implementation is expecting.  From the docs for json2:</p>
<blockquote><p>A toJSON method does not serialize: it returns the value represented by the name/value pair that should be serialized, or undefined if nothing should be serialized.</p></blockquote>
<p>Prototype&#8217;s <code>toJSON()</code> <em>is</em> serialising.  There don&#8217;t seem to be any sensible solutions to this online, but it&#8217;s actually relatively simple to solve using a replacer function, allowed for in the json2 API:</p>
<pre class="brush: jscript;">
var reqdata = JSON.stringify(req, function(key, value) {
	if (typeof this[key] == 'object' &amp;&amp; Object.prototype.toString.apply(this[key]) === '[object Array]') {
		return this[key];
	} else {
		return value;
	}
});
</pre>
<p>Essentially this says &#8216;for each key in the data structure, if the value is an array, use the raw value, otherwise use the value you gave me&#8217;.  This makes sense when you look at the sequence of steps that stringify() goes through for each key it encounters:</p>
<ol>
<li>If the value has a toJSON() method, call it.</li>
<li>If a replacer function has been given, call it.</li>
<li>If the remaining value is a scalar, return it.</li>
<li>If the remaining value is an object, stringify each member, then concatenate keys and values within braces {key:val,key:val}</li>
<li>If the remaining value is an array, stringify each element, then concatenate values within brackets [val,val,val]</li>
</ol>
<p>So, <code>stringify()</code> has already called Prototype&#8217;s <code>toJSON()</code> method by the time it executes the replacer function, but we can use the replacer function to restore the original value, allowing <code>stringify()</code> to then deal with the array by calling itself recursively.</p>
<p>The result is that we can ensure that even if a <code>toJSON()</code> method does exist on the Array object, its output is ignored, and we then get the JSON string that we wanted.</p>
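<p>The effect can be reproduced without Prototype by temporarily giving arrays a serialising <code>toJSON()</code> of our own.  This is a sketch that assumes an environment with a <code>JSON</code> object available (json2.js or a native implementation); <code>installSerialisingToJSON</code>, <code>removeSerialisingToJSON</code> and <code>restoreArrays</code> are hypothetical names, and the stand-in method only imitates the kind of behaviour described above rather than copying Prototype&#8217;s code.</p>

```javascript
// Stand-in for a library that gives arrays a serialising toJSON.
// This is a hypothetical imitation for the demo, not Prototype's real code.
function installSerialisingToJSON() {
  Array.prototype.toJSON = function () {
    var parts = this.map(function (item) { return JSON.stringify(item); });
    return '[' + parts.join(',') + ']';
  };
}

function removeSerialisingToJSON() {
  delete Array.prototype.toJSON;
}

// Replacer: for arrays, hand back the raw value from the holder object
// instead of whatever toJSON produced.
function restoreArrays(key, value) {
  var raw = this[key];
  if (Object.prototype.toString.call(raw) === '[object Array]') {
    return raw;
  }
  return value;
}

installSerialisingToJSON();
var data = { key: 'val', list: [{ key: 'val' }, { key: 'val' }] };
var doubled = JSON.stringify(data);                 // list ends up as a quoted string
var repaired = JSON.stringify(data, restoreArrays); // list ends up as a real array
removeSerialisingToJSON();
```

<p>Here <code>doubled</code> carries the array as an escaped string, while <code>repaired</code> contains it as a proper JSON array.</p>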
]]></content:encoded>
			<wfw:commentRss>http://assanka.net/content/tech/2009/09/02/json2-js-vs-prototype/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

