HTML Post Loading and Processing Resources using AJAX – Part 3: multiple, dependent resources and custom processing

AJAX opens up a lot of possibilities, most of which in truth already
existed pre-AJAX. In a series of articles, I am discussing the
concept of post loading resources: after the HTML document is loaded,
additional resources can be retrieved from various sources, both local
and remote, and processed in many different ways. The concept
of Post Loading Resources was introduced in the first installment, Ajax-based Post Loading of resources in HTML pages – for reuse of resources and fast user feedback.
The second article in the series, HTML Post Loading Resources Framework (AJAX Based) – Part 2 – Loading and pasting simple content, presented a
simple JavaScript library that implemented the basics of this concept.
It loaded multiple resources, both local and remote, and processed them
by pasting them into the value property of form items or the innerHTML
property of container elements such as DIV and TD.

This article introduces a little more complexity: how to specify a custom
post processor for a Post Loaded Resource, for example to execute a
post loaded piece of JavaScript or to scrape HTML content from a
remote Web Page, and how to define dependencies between Post Loaded
Resources to ensure that processing only takes place when all required
resources are available. The source code for this article can be
downloaded in the Resources section.

Defining a Custom Processor

Allowing custom processors for Post Loaded Resources is extremely simple.
Thus far, we had built into our framework a standard processor function
processPostLoad() called from the ContentLoader object when the
response was successfully received. We will not change this, but we
will add a new property processor to the PostLoadResource
object and extend the processPostLoad() function, or actually its
helper startProcessing(), to deal with this property.

First a change in the instantiation of the PLR object:

amis.PostLoadResource=function           
                   ( elementId
                   , url
                   , requireProxy // boolean indicating whether or not the resource must be acquired through a proxy from a remote domain
                   , label
                   , processor // function reference of function that will process the resource when received
                   , dependsOn // other PostLoadResources this PLR may depend on, such as an XSLT that an XML depends on for being processed
                   ) {
  this.elementId=elementId;
  this.url=url;
  this.requireProxy = requireProxy;
  this.state = amis.STATE_UNINITIALIZED;
  this.label = label;
  this.req = null;
  this.id = null;
  this.processor= processor;
}

  // create a new PLR object and add it to the array of PLR objects to be dealt with when the page has loaded
  function addPostLoadResource
           ( elementId
           , url
           , requireProxy // boolean indicating whether or not the resource must be acquired through a proxy from a remote domain
           , label
           , processor // function reference of function that will process the resource when received
           ) {
     var plr = new amis.PostLoadResource(elementId, url, requireProxy,label, processor);
     var size = postLoadResources.push( plr ); // add a new PostLoadResource object to the array
     plr.id = size -1; // ensure the plr objects knows where it sits in the postLoadResources array
     return plr;
  }


The second change that we need for a custom processor is in the startProcessing() function:

  function startProcessing(postLoadResource) {
    if (postLoadResource.processor) {
      // a custom processor is set: invoke it with the PLR object as 'this'
      postLoadResource.processor.call(postLoadResource);
    }
    else {
      if (postLoadResource.elementId) {
        // go find element and load contents
        var element = el(postLoadResource.elementId);
        // form items expose a string value property, even when it is still empty
        if (typeof element.value == 'string') {
          element.value = postLoadResource.req.responseText;
        }
        else {
          element.innerHTML = postLoadResource.req.responseText;
        }
      }
    }// processor
    postLoadResource.state = amis.STATE_PROCESSED;
  }// startProcessing

It is a simple change: the function checks whether the processor
property has a value. If so, it is assumed to be a function reference and
that function is invoked, with the current postLoadResource object as
context. The latter means that the custom processor can refer to the
PLR object through the this reference. We see how that is done in this
sample custom processor; it takes the responseXML object, retrieves all
ROW elements and prints them as list items to the target
container:

  function customDeptProcessor() {
    var postLoadResource = this;
    var depts = postLoadResource.req.responseXML.getElementsByTagName("ROW");
    var newContent ="<H3>The list of Departments</H3><ul>";
    for (var i=0; i<depts.length; i++) {
      var dept = depts[i];
      var dname = dept.getElementsByTagName("DNAME")[0].firstChild.nodeValue;
      var loc = dept.getElementsByTagName("LOC")[0].firstChild.nodeValue;
      newContent = newContent+"<li>"+dname+" (in "+loc+")</li>";
    }//for

    el(postLoadResource.elementId).innerHTML=newContent+"</ul>";
  }// customDeptProcessor
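
To make the role of this concrete, here is a minimal sketch that runs outside the browser. The fakeResource object and the upperCaseProcessor function are hypothetical stand-ins and not part of the PLR library; only the processor.call(resource) invocation mirrors what startProcessing() does.

```javascript
// Hypothetical stand-in for a PostLoadResource whose response has arrived.
var fakeResource = {
  label: 'DEMO_RESOURCE',
  req: { responseText: 'hello from the server' },
  processor: upperCaseProcessor,
  result: null
};

// Hypothetical custom processor: `this` is the resource it was invoked on.
function upperCaseProcessor() {
  var postLoadResource = this;
  postLoadResource.result = postLoadResource.req.responseText.toUpperCase();
}

// Same invocation style as in startProcessing(): call() sets `this`.
fakeResource.processor.call(fakeResource);
console.log(fakeResource.label + ': ' + fakeResource.result);
```

Because the processor receives the resource as its context, the same function can be reused for several resources without any global bookkeeping.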

The XML Resource dept.xml that is processed here looks like:

[Image: contents of dept.xml]

To instruct the PLR library to post load and process the dept.xml
resource, the following snippet is required in the HTML document:

    <DIV style="background-color :yellow">
      <pre id="PL1">
        <Script language="JavaScript">    
          addPostLoadResource('PL1', 'dept.xml', false, 'POSTLOAD_DEPARTMENTS',customDeptProcessor);
        </Script>
      </pre>
    </DIV>

The result of this custom processor is shown below, in the yellow box.

[Screenshot: the resulting page with the post loaded content]

Example of Custom Post Processor: HTML Scraping

The second container in the screenshot above displays the actual headlines from the popular website www.theserverside.com.
Since these headlines are also available as an RSS feed, the example is
slightly contrived, but instructive nevertheless. In this case, we have
specified http://www.theserverside.com/index.tss
as our resource-to-be-post-loaded. We have also set up a custom
processor that knows the (current) structure of the home page on TSS
and is capable of getting the headlines out of the HTML page and into a
piece of our own HTML that we paste in the innerHTML property of our
DIV. This is a fairly simple way of putting together a portal on the
world: just grab and scrape interesting portions of good websites. On a
moral note: you are using their content without displaying their
commercial messages, which can perhaps be seen as a form of cheating or
worse. That is for your own conscience to decide.

The custom processor that takes the HTML from TSS and ‘scrapes’ it looks like this:

  function customProcessTSS() {
    var postLoadResource = this;
    var html = postLoadResource.req.responseText;
    // look for the first 10 H1 elements
    var pos=-1;
    var endpos=-1;
    var i=0;
    var newHtml="";
    for (i=0;i<10;i++) {
      pos = html.indexOf('<h1>', pos+1);
      if (pos < 0) break; // fewer than 10 headlines on the page
      endpos = html.indexOf('</h1>', pos+1);
      if (endpos < 0) break; // unterminated H1 element: stop scraping
      newHtml = newHtml + html.substring(pos, endpos+5);
      // replace local, relative reference within TSS with full URL path
      newHtml= newHtml.replace('<a href="/news/','<a target="_blank" href="http://www.theserverside.com/news/');
    }//for
    el(postLoadResource.elementId).innerHTML='<H2>TSS New Entries</H2>'+newHtml;
  }// customProcessTSS

It is fairly basic and not overly robust. It depends on the fact that the
Headlines on TSS are wrapped inside H1 elements – and no other
information on the page is. So it just locates an H1 element, finds the
associated </H1> tag and uses everything in between as the
content to scrape. It turns out that the references to the actual news
stories are local references, not starting with http, but starting with
a relative reference /news/. In order to maintain the reference, we
have to change the relative reference into an absolute one, pointing to
the website of TSS. Finally, we get hold of the target element
specified in the PLR object and paste the list of scraped headlines
into it.
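
The scraping steps described above can also be isolated in a pure string function, which makes them easy to try without a browser or a live page. This is only a sketch: the helper name extractHeadlines, its parameters and the sample markup are made up for illustration, and it assumes the same page structure as described above (headlines in lowercase h1 tags, links starting with /news/).

```javascript
// Hypothetical helper: collect up to `max` <h1>...</h1> fragments from `html`
// and rewrite site-relative /news/ links to absolute ones under `baseUrl`.
function extractHeadlines(html, max, baseUrl) {
  var headlines = [];
  var pos = html.indexOf('<h1>');
  while (pos !== -1 && headlines.length < max) {
    var endpos = html.indexOf('</h1>', pos);
    if (endpos === -1) {
      break; // unterminated heading: stop rather than copy garbage
    }
    var fragment = html.substring(pos, endpos + 5);
    // split/join replaces every occurrence, not just the first one
    fragment = fragment.split('<a href="/news/')
                       .join('<a target="_blank" href="' + baseUrl + '/news/');
    headlines.push(fragment);
    pos = html.indexOf('<h1>', endpos + 5);
  }
  return headlines;
}

// Made-up sample markup, standing in for the downloaded home page.
var sample = '<p>intro</p><h1><a href="/news/a.tss">First</a></h1>' +
             '<div>ad</div><h1><a href="/news/b.tss">Second</a></h1>';
var found = extractHeadlines(sample, 10, 'http://www.theserverside.com');
console.log(found.length);
console.log(found[0]);
```

A custom processor could then join the returned fragments and paste the result into the target element, keeping the string handling separate from the DOM update.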

To include these post loaded headlines in our HTML document, we include the following fragment:

    <h3>Download the frontpage of The Server Side and extract the headlines</h3>
    <Script language="JavaScript">    
      addPostLoadResource('PL2', 'www.theserverside.com/index.tss', true,  'POSTLOAD_TSS_LOADER', customProcessTSS);
    </Script>
    <DIV id="PL2" >
    </DIV>

Processing Post Loaded JavaScript Resources

One of the ways in which a post loaded resource can be processed is by
executing it. The JavaScript eval() function allows us to pass in a
string – or the responseText for a post loaded resource – and have it
executed. This allows us to add additional JavaScript functions and
data structures to our HTML document after it has been loaded. In an
earlier post I mused over ways to work with large volumes of data
cached on the client machine in ‘static’ JavaScript libraries in
combination with lightweight cache updates. Being able to post load a
JavaScript resource and execute it on the fly is probably the answer to
that challenge. See Increasing speed of making data available in the web-page – on AJAX, controlling data caching and JavaScript Libraries.

A very simple example of a custom processor that processes the Post Loaded Resource as JavaScript would be something like:

  // this function treats the req.responseText as a piece of JavaScript
  // the responseText is evaluated : this can do several things like
  // creating new functions, adding data , calling already pre-loaded
  // JavaScript objects; this shows we can pull in additional JavaScript
  // libraries in postload operations - from other domains as well as 
  // our own!!
  function customJSProcessPostLoad() {
    var postLoadResource = this;
    eval(postLoadResource.req.responseText);
  }
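
One detail worth knowing when evaluating post loaded code: a direct eval() call inside a function evaluates the code in that function's scope, so functions declared by the loaded script do not become globals. They remain reachable only through references the script hands out itself, such as event handlers or callbacks. The sketch below contrasts this with an indirect eval call, which evaluates in the global scope. The function and helper names are hypothetical and only serve the illustration.

```javascript
// Direct eval: declarations in `source` stay local to this function call.
function evalInFunctionScope(source) {
  eval(source);
}

// Indirect eval: the comma expression makes the call indirect, so `source`
// is evaluated in the global scope and its declarations become globals.
function evalInGlobalScope(source) {
  (0, eval)(source);
}

evalInFunctionScope("function plrLocalHelper() { return 'local'; }");
evalInGlobalScope("function plrGlobalHelper() { return 'global'; }");

// Only the helper created through indirect eval is visible out here.
var localVisibility = typeof plrLocalHelper;
var globalVisibility = typeof plrGlobalHelper;
console.log(localVisibility + ' / ' + globalVisibility);
```

Whichever variant is used, evaluated code runs with the full privileges of the page, so scripts fetched from other domains deserve the same trust considerations as any other included library.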

Dealing with dependencies between Post Loaded Resources

The last challenge we will tackle in this article is how to link PLR
objects to one another. It happens regularly that a certain resource
can only be processed when another resource is available. For example,
the processing of Resource A must be done by a JavaScript function that
will be loaded as part of Resource B. Or Resource A is an XML document
and Resource B the XSLT stylesheet that is needed to transform it (see
next installment for XSLT transformations). So we need a way to tell
the PLR framework that a resource can be loaded right away but can only be
processed when the resources it depends upon have been loaded. We will
implement this following a well known Design Pattern: the Observer.

First we extend the Post Load Resource object with two properties: an Array
with PLR objects the object depends upon (the observed) and an Array
with PLR objects that depend on it (the observers).

/*--- PostLoadResource object, extended with dependency bookkeeping ---*/
amis.PostLoadResource=function           
                   ( elementId
                   , url
                   , requireProxy // boolean indicating whether or not the resource must be acquired through a proxy from a remote domain
                   , label
                   , processor // function reference of function that will process the resource when received
                   , dependsOn // other PostLoadResources this PLR may depend on, such as an XSLT that an XML depends on for being processed
                   ) {
  this.elementId=elementId;
  this.url=url;
  this.requireProxy = requireProxy;
  this.state = amis.STATE_UNINITIALIZED;
  this.label = label;
  this.req = null;
  this.id = null;
  this.processor= processor;
  this.dependsOn = dependsOn;
  this.dependents = new Array();
  // also tell every PLR that is depended on to notify this PLR when they are processed
  if (dependsOn) {
    var i=0;
    for (i=0;i<dependsOn.length;i++) {
      dependsOn[i].dependents.push(this);
    }//for
  } // if dependsOn
}

  // create a new PLR object and add it to the array of PLR objects to be dealt with when the page has loaded
  function addPostLoadResource
           ( elementId
           , url
           , requireProxy // boolean indicating whether or not the resource must be acquired through a proxy from a remote domain
           , label
           , processor // function reference of function that will process the resource when received
           , dependsOn // an Array of PLR objects on which the newly added PLR depends
           ) {
     var plr = new amis.PostLoadResource(elementId, url, requireProxy,label, processor, dependsOn);
     var size = postLoadResources.push( plr ); // add a new PostLoadResource object to the array
     plr.id = size -1; // ensure the plr objects knows where it sits in the postLoadResources array
     return plr;
  }

The dependsOn parameter holds an Array that contains PLR objects on which
the new Post Load Resource depends. During creation of the PLR object,
this array is iterated over and every PLR on which a dependency exists
is informed of the fact that the new PLR depends on it, that is, listens
to or observes it. Note that this is a very low-level,
unencapsulated way of registering listeners or observers on an
Observable. There is definitely some room for refactoring here.
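
As an indication of what such a refactoring could look like, the sketch below moves the registration behind a method on the observed object. The ObservableResource type and its addDependent method are hypothetical and stripped down to the dependency bookkeeping; they are not part of the library as downloaded.

```javascript
// Hypothetical, stripped-down resource type used only for this sketch.
function ObservableResource(label, dependsOn) {
  this.label = label;
  this.dependsOn = dependsOn || [];
  this.dependents = [];
  for (var i = 0; i < this.dependsOn.length; i++) {
    // ask the observed resource to register us, instead of pushing
    // into its dependents array from the outside
    this.dependsOn[i].addDependent(this);
  }
}

// Registers an observer once; repeated registrations are ignored.
ObservableResource.prototype.addDependent = function (observer) {
  for (var i = 0; i < this.dependents.length; i++) {
    if (this.dependents[i] === observer) {
      return;
    }
  }
  this.dependents.push(observer);
};

var xslt = new ObservableResource('STYLESHEET');
var xml = new ObservableResource('DOCUMENT', [xslt]);
xslt.addDependent(xml); // second registration has no effect
console.log(xslt.dependents.length);
```

With the registration in one place, checks such as duplicate detection can be added without touching the constructor.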

The required logic is created in the processPostLoad() function that is
invoked whenever the response for a PLR is successfully received by the
ContentLoader object. This function no longer starts processing the PLR
automatically. It first checks whether the PLR has any observers, that
is, any PLRs that depend on it. If so, these dependent PLRs are notified of
the fact that this PLR is loaded, so that they can perhaps start
processing themselves. Next, processPostLoad() checks whether the PLR
itself depends on other PLRs and, if so, verifies whether
those PLRs are already in. Only when they are all loaded can processing
of this PLR proceed. If not all dependencies are satisfied, this PLR is
not processed but put on hold: state = amis.STATE_LOADED_AND_WAITING.
The processPostLoad() function uses the dependsOnUnloaded() helper to
find out whether any dependencies exist on PLRs that have not yet been
loaded.

  // this function checks whether the PLR has dependencies on other PLRs
  // that are not yet loaded
  function dependsOnUnloaded( postLoadResource) {
    // check if the postLoadResource depends on other plrs that are not yet loaded
    // if so, it goes into the waiting room - state= amis.STATE_LOADED_AND_WAITING
    if (postLoadResource.dependsOn) {
      var i=0;
      for (i=0;i<postLoadResource.dependsOn.length;i++) {
        if (postLoadResource.dependsOn[i].state <  amis.STATE_LOADED_AND_PROCESSING) {
          return true;
        }
      }//for
    } // if dependsOn
    return false;
  }//dependsOnUnloaded


  // This function is called by the ContentLoader when the AjaxRequest is successfully completed
  // and the requested content is available for further processing.
  function processPostLoad() {
    var postLoadResource = this.parameter;
    postLoadResource.req = this.req;
    postLoadResource.state = amis.STATE_LOADED_AND_PROCESSING;
    // now tell all dependents that this plr is being processed
    if (postLoadResource.dependents) {
      var i=0;
      for (i=0;i<postLoadResource.dependents.length;i++) {
        postLoadResource.dependents[i].notify(postLoadResource);
      }//for
    } // if dependents
    // check if the postLoadResource depends on other plrs that are not yet processed
    // if so, it goes into the waiting room - state= amis.STATE_LOADED_AND_WAITING
    if (dependsOnUnloaded( postLoadResource)) { 
       postLoadResource.state = amis.STATE_LOADED_AND_WAITING;
       return;
    }
    startProcessing( postLoadResource);
  }// processPostLoad

The notify function that is invoked by processPostLoad is defined on the PLR object, on the prototype itself:

// this function is to be called for a PLR that depends on another PLR when
// that other PLR has been loaded so it (the listener or observer) can evaluate
// whether all its dependencies are available and so it can start processing itself
amis.PostLoadResource.prototype.notify=function(plr){
  // only start when this PLR is waiting and none of its dependencies is still unloaded
  if (this.state == amis.STATE_LOADED_AND_WAITING && !dependsOnUnloaded(this)) {
    startProcessing(this);
  }
}

The only piece missing is an example of the JavaScript call we will now use
to specify dependencies between Post Load Resources. The third example
in the screenshot above shows two Select elements, with data on
Departments and Employees. However, the HTML document only contains
this snippet:

     <h3>Postload an HTML fragment (creating select items) and a dependent JS library (that loads the data)</h3>
      <Script language="JavaScript">    
        var empdeptlistsPLR = addPostLoadResource('PL_9', 'EmpDeptLists.txt', false,  'POSTLOAD_EMPDEPTLISTS_LOADER');
        var jsempdeptPLR = addPostLoadResource(null, 'EmpDeptListManager.js', false, 'JS_FOR_EMPDEPT_LOADER', customJSProcessPostLoad,  new Array(empdeptlistsPLR));  
     </Script>
    <DIV id="PL_9" style="background-color :pink">
    </DIV>

The SELECT elements are post loaded from a local resource, a text file
that holds some text that we can paste as HTML into the DIV's innerHTML
property. The data is post loaded from XML files, dept.xml and
emp_<deptno>.xml. However, the JavaScript that post loads the
data for the SELECT elements must first be post loaded itself. What we
see in the HTML snippet above is that two resources are post loaded:
the EmpDeptLists.txt resource, which contains a bit of HTML setting up
the SELECT items, and the EmpDeptListManager.js resource, which is a
JavaScript library that will go on to post load the department data and
populate the SELECT element with it. It also defines the onChange event
handler on the Department list that will populate the Employees list
based on the Department chosen.

Before this JavaScript library can do
its work, the HTML elements on which it acts must be available. So we
need to make sure that the jsempdeptPLR is only processed when the
empdeptlistsPLR has been loaded. We do this by including the new
Array(empdeptlistsPLR) in our call to addPostLoadResource. The
dependency facilities that we added to the PLR framework above take
care of only processing the JavaScript in EmpDeptListManager.js when
the empdeptlistsPLR is in. Note that we have a weak spot here: in this
particular case, it is not good enough that the empdeptlistsPLR is in;
it must have been processed as well! Another item on the TODO list.
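
One possible way to close this weak spot is to wake up dependents only after a resource has actually been processed, and to let a waiting resource check for processed rather than loaded dependencies. The sketch below simulates that ordering without any AJAX calls. Everything in it is hypothetical: the DemoResource type, the tryProcess function, and in particular the numeric state values, since the real amis.STATE_* constants are not shown in this article.

```javascript
// Hypothetical state values; the real amis.STATE_* constants are not shown
// in this article, so these numbers are assumptions for the sketch only.
var STATE = { UNINITIALIZED: 0, LOADED_AND_WAITING: 1, PROCESSED: 2 };
var processedOrder = [];

function DemoResource(label, dependsOn) {
  this.label = label;
  this.state = STATE.UNINITIALIZED;
  this.dependsOn = dependsOn || [];
  this.dependents = [];
  for (var i = 0; i < this.dependsOn.length; i++) {
    this.dependsOn[i].dependents.push(this);
  }
}

// True while at least one dependency has not been processed yet.
function dependsOnUnprocessed(resource) {
  for (var i = 0; i < resource.dependsOn.length; i++) {
    if (resource.dependsOn[i].state < STATE.PROCESSED) {
      return true;
    }
  }
  return false;
}

// Process the resource if possible, then wake up waiting dependents.
function tryProcess(resource) {
  if (dependsOnUnprocessed(resource)) {
    resource.state = STATE.LOADED_AND_WAITING;
    return;
  }
  processedOrder.push(resource.label); // stands in for startProcessing()
  resource.state = STATE.PROCESSED;
  for (var i = 0; i < resource.dependents.length; i++) {
    if (resource.dependents[i].state === STATE.LOADED_AND_WAITING) {
      tryProcess(resource.dependents[i]);
    }
  }
}

var html = new DemoResource('EmpDeptLists.txt');
var script = new DemoResource('EmpDeptListManager.js', [html]);
tryProcess(script); // the script's response arrives first: it must wait
tryProcess(html);   // the HTML arrives: processed, then the script follows
console.log(processedOrder.join(' -> '));
```

In this simulation the script is processed only after the HTML fragment has been pasted, regardless of the order in which the two responses arrive.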

Let us now turn our attention to the contents of the JavaScript library that is loaded
in. It contains a number of functions that help process the
Department data that we will load, react to the selection of another Department
by loading the corresponding Employee data into the Employees list, and so on.
It also contains some JavaScript statements that are immediately
executed:

  // execute immediately after loading this piece of JavaScript
  // find the dept element (a select) and add an onChange event handler: the function loadEmployees
  document.getElementById("dept").onchange=loadEmployees;

  // create a new PostLoadResource object for the dept.xml document with department data
  var deptPLR = addPostLoadResource(null, 'dept.xml', false, 'POSTLOAD_EMPDEPT_LOADER', deptProcessor);
  // kick it off to get the Departments asap
  postLoadSingleResource(deptPLR);

First we get hold of the dept item, the Department SELECT that is created
when the PLR EmpDeptLists.txt was loaded and processed. Then we assign
to this item an onChange event handler. The handler is a function
called loadEmployees(), which is also defined in this library. When a new
Department is selected, loadEmployees() is invoked. It creates a new
PLR object and has it processed by the PLR framework:

  function loadEmployees()  {
    var deptList = document.getElementById("dept");	
    clearList(document.getElementById("emp"));
    // for each department there is a file called emp_<number of department> to hold the employees
    var file = "emp_"+ deptList.value +".xml";
    // define a new PostLoadResource
    var empPLR = addPostLoadResource(null, file, false, 'POSTLOAD_EMP_LOADER', empProcessor);
    // and immediately kick it off
    postLoadSingleResource(empPLR);
  }

Back in the immediately executed statements, we create a new PLR object
for the dept.xml resource that contains the
Department data that we want to use for populating the Department list.
Since the HTML document is already loaded and therefore the onLoad
event that called the goPostLoadResources() function will not take
place again, we have to start processing of this new PLR ourselves,
hence the call to postLoadSingleResource. We specify a custom processor
called deptProcessor. This is a reference to a function that is loaded
as part of this JavaScript library:

  /**
   * Populate the list with the data from the request
   * (Could be done in a generic manner depending on the XML...)
   */
  function deptProcessor() { 
    var postLoadResource = this;
    var list = document.getElementById("dept");	
    clearList(list);
    addElementToList(list, "--", "Choose a Department" );
    var items = postLoadResource.req.responseXML.getElementsByTagName("ROW");
    for (var i=0; i<items.length; i++) {
      var node = items[i];
      var deptno = node.getElementsByTagName("DEPTNO")[0].firstChild.nodeValue;
      var dname = node.getElementsByTagName("DNAME")[0].firstChild.nodeValue;
      var loc = node.getElementsByTagName("LOC")[0].firstChild.nodeValue;
      addElementToList(list, deptno, dname  );
    }//for
  }// deptProcessor

  /**
   * remove the content of the list
   */
  function clearList(list) {
    while (list.length > 0)
    {
      list.remove(0);
    }
  }

  /**
   * Add a new element to a selection list
   */
  function addElementToList(list, value, label)
  {
    var option = document.createElement("option");
    option.value = value;
    var labelNode = document.createTextNode(label);
    option.appendChild(labelNode   );
    list.appendChild(option);
  }
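
The comment on deptProcessor hints that the population of a list could be done generically. A minimal sketch of that idea follows; it also guards against missing or empty elements, for which firstChild is null. The function names textOf and rowsToOptions are hypothetical, and the small fake document only mimics the two DOM members the sketch relies on so that it can run without a browser. The sample values are made up.

```javascript
// Read the text of the first <tagName> child of `node`, or '' when the
// element is missing or empty (firstChild is null for an empty element).
function textOf(node, tagName) {
  var matches = node.getElementsByTagName(tagName);
  if (matches.length === 0 || !matches[0].firstChild) {
    return '';
  }
  return matches[0].firstChild.nodeValue;
}

// Turn every <ROW> of an XML document into { value, label } pairs.
function rowsToOptions(xmlDoc, valueTag, labelTag) {
  var rows = xmlDoc.getElementsByTagName('ROW');
  var options = [];
  for (var i = 0; i < rows.length; i++) {
    options.push({ value: textOf(rows[i], valueTag),
                   label: textOf(rows[i], labelTag) });
  }
  return options;
}

// Stand-in for responseXML so the sketch runs without a browser: it only
// mimics getElementsByTagName and firstChild.nodeValue. Sample data is made up.
function fakeElement(children) {
  return {
    getElementsByTagName: function (tag) {
      return children[tag] || [];
    }
  };
}
function fakeText(text) {
  return { firstChild: text === null ? null : { nodeValue: text } };
}
var fakeDoc = fakeElement({
  ROW: [
    fakeElement({ DEPTNO: [fakeText('10')], DNAME: [fakeText('SALES')] }),
    fakeElement({ DEPTNO: [fakeText('20')], DNAME: [fakeText(null)] })
  ]
});

var options = rowsToOptions(fakeDoc, 'DEPTNO', 'DNAME');
console.log(JSON.stringify(options));
```

In the browser, a processor could loop over rowsToOptions(this.req.responseXML, 'DEPTNO', 'DNAME') and call addElementToList for each entry; the same helper would serve the employee list with 'EMPNO' and 'ENAME'.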

[Screenshot: the Department list populated with the post loaded data]

When the deptProcessor has done its job, the Department Select is populated
with Department data. When we pick a Department, the onChange event is
fired, invoking the loadEmployees() function that creates a new PLR to
load the relevant employee data that will be processed by the
empProcessor() when received. The empProcessor() populates the Employee
List:

[Screenshot: the Employee list populated for the selected Department]

One last element is to be discussed: the empProcessor function that is
called to process the results when the empPLR, created when a
new Department was selected, is ready for action. It is quite
similar to the deptProcessor function.

  function empProcessor() {
    var postLoadResource = this;
    var list = document.getElementById("emp");	
    var items = postLoadResource.req.responseXML.getElementsByTagName("ROW");
    clearList(list);
    if (items.length > 0) {
      for (var i=0; i<items.length; i++) {
        var node = items[i];
        var empno = node.getElementsByTagName("EMPNO")[0].firstChild.nodeValue;
        var ename = node.getElementsByTagName("ENAME")[0].firstChild.nodeValue;
        addElementToList(list, empno, ename  );
      }//for
    }
    else
    {
     alert("No Employee for this department");
    }
  }// empProcessor

If we select another Department, new data is loaded and used to populate the list:

[Screenshot: the Employee list after selecting another Department]

Resources

Download the sources for this article: PostLoadResourceDemo2.zip

The installments in this series are