Tangled in the Threads

Jon Udell, February 4, 2002

What is general-purpose scripting?

Language and environment are the two axes of debate. Someday we may be able to separate these issues, but probably not anytime soon.

A friend wrote recently to ask my thoughts on [ECMA|Java|J]Script (which I will now, out of habit, refer to as JavaScript) as a general-purpose scripting language. He teaches software engineering courses and would like to use Python but, he says, there's pushback because people seem to long for a syntax like that of C or Java. Coincidentally I'd just been doing some work that prompted me to rethink my own assumptions about JavaScript's effective domain. Although I usually reach for JavaScript when I need to activate a Web page in some way, and generally rely on its special relationship to the browser and its document object model, I've also increasingly found myself running JavaScript from the command line using the cscript utility that's part of the Windows Script Host (WSH).

Here's an example. I work on a project that needs some ad-hoc reporting. The database is accessible at an authenticated URL, allows a subset of SQL to be passed on the URL-line, and returns results as HTML or XML. Queries have two general flavors: some count or sum matching records for a set of date ranges, others display them. The queries further subdivide into families; within each family a query has the same general shape, which varies along one or several dimensions. Typically, in this situation, I've used templates to express the query shapes, and Perl to interpolate values into variable slots and run the queries. But this time, I wanted to make the mechanism available to others without requiring them to have or use Perl, or deal with any CGI machinery. Why not use JavaScript to manage the query templating and query execution?

There was one obstacle. The queries that return counts of records are used interactively, as a kind of status dashboard. But there's also a need to record these numbers at regular intervals and save them for longitudinal analysis. This kind of batch process seems quite foreign to JavaScript's domain. But in fact, WSH enabled me to share more JavaScript logic between the two modes than I'd realized I could.

Both the interactive version and the batch version of this system work with lists of dynamically-generated URLs. A JavaScript associative array enumerates the functions that generate those URLs, like so:

var commands = new Object();

commands['countOfSomeEventSince30DaysAgo'] =
    { fn: function () {
        return factory( countOfSomeEvent,
                [OP, '>=', DATE, '30daysago' ] ); } };

commands['countOfSomeEventToday'] =
    { fn: function () {
        return factory( countOfSomeEvent,
                [OP, '>=', DATE, 'today' ] ); } };

One of the advantages of packaging the dynamically-generated URLs this way is that it gives the author of an HTML report an easy way to link to a drill-down. You could write, for example:

You can always check the current tally 
<a href="javascript:commands['countOfSomeEventToday'].fn()">here</a>.

The parameterization of the query is hidden from the HTML-based user, and handled by the factory() function. Its arguments are a query template and a list of name/value pairs. Here's the function:

function factory(url,args)
    {
    var i = 0;
    while ( i < args.length )
        {
        // args holds name/value pairs: slot name, then its value
        var key = args[i];
        var val = args[i+1];
        if ( key.match ('DATE') != null )
            { val = replaceDate(val) }
        var re = new RegExp(key);
        url = url.replace ( re, val );
        i += 2;
        }
    return process(url);
    }

The first argument is a JavaScript string representing the query template for a family of queries, with placeholders for slots needing interpolation. The second argument enumerates the slots and their values. When a slot is called 'DATE' its value is a label like 'today' or 'lastmonth'; the replaceDate() function uses JavaScript's very handy date library to map these labels to date strings.
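The column doesn't show replaceDate(), so here is only a minimal sketch of what such a function might look like, built on the standard Date object. The set of labels it handles ('today', 'NNdaysago', 'lastmonth'), the YYYY-MM-DD output format, the optional second parameter, and the helper names pad2() and formatDate() are all assumptions made for illustration, not the original implementation.

```javascript
// Hypothetical sketch of replaceDate(); labels and output format are assumptions.

// Left-pad a day or month number to two digits.
function pad2(n) {
    return (n < 10 ? '0' : '') + n;
}

// Format a Date as YYYY-MM-DD using local-time fields.
function formatDate(d) {
    return d.getFullYear() + '-' + pad2(d.getMonth() + 1) + '-' + pad2(d.getDate());
}

// Map a label such as 'today' or '30daysago' to a date string.
// 'now' is an optional Date, added here so the function can be
// exercised with a fixed reference date; it defaults to the current date.
function replaceDate(label, now) {
    var d = now ? new Date(now.getTime()) : new Date();
    var m = label.match(/^(\d+)daysago$/);
    if (label == 'today') {
        // nothing to adjust
    } else if (m != null) {
        // setDate() normalizes out-of-range days across month and year boundaries
        d.setDate(d.getDate() - parseInt(m[1], 10));
    } else if (label == 'lastmonth') {
        // first day of the previous month
        d.setDate(1);
        d.setMonth(d.getMonth() - 1);
    } else {
        return label; // unknown labels pass through unchanged
    }
    return formatDate(d);
}
```

With a reference date of February 4, 2002, replaceDate('30daysago', new Date(2002, 1, 4)) yields '2002-01-05'. The Date object absorbs the month rollover, so no calendar arithmetic has to be written by hand.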

How a URL is used depends on the context. In batch mode, the process() function just returns it as a string. In interactive mode, it fetches the URL into the browser.

function process(url)
    {
    if ( isWScript() )
        { return (url) }
    else
        { document.location.href = url; }
    }

Here's the implementation of isWScript(), by the way.

function isWScript()
    { return ( typeof (WScript) == 'object' ) }

And here is the only piece of mainline code in the script, something you wouldn't normally see in interactive JavaScript:

if ( isWScript() ) 
    {   genUrls();  }

Finally, here's the genUrls() function:

function genUrls()
    {
    for (var command in commands)
        { WScript.echo(command + ',' + commands[command].fn() ) }
    }

genUrls() gets called only if isWScript() returns true -- that is, if the script is invoked under the control of WSH. There are two ways to do that:

wscript query.js

 - or -

cscript query.js

The wscript variant, though it works, makes no sense in this situation; each line of output pops up in an alert window that has to be dismissed. The cscript variant, however, turns the script into a command-line-oriented component that you can pipeline.
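To make the batch-mode output concrete, here is a minimal, self-contained sketch of the pieces above that should run in any JavaScript engine, not just WSH. The template URL, the values assigned to OP and DATE, the fixed dates returned by the stand-in replaceDate(), and the use of an array in place of WScript.echo are all assumptions made for illustration; the real template and date logic aren't shown in this column.

```javascript
// Hypothetical query template; the host and parameters are invented.
var countOfSomeEvent = 'http://example.com/query?table=events&op=OP&date=DATE';

// Assumed slot names; the column does not show how OP and DATE are defined.
var OP = 'OP';
var DATE = 'DATE';

// Stand-in for replaceDate(): fixed strings keep the output repeatable.
function replaceDate(label) {
    var dates = { 'today': '2002-02-04', '30daysago': '2002-01-05' };
    return dates[label];
}

// Same interpolation loop as factory() above, but returning the URL
// directly, which is what process() does in batch mode.
function factory(url, args) {
    for (var i = 0; i < args.length; i += 2) {
        var key = args[i];
        var val = args[i + 1];
        if (key.match('DATE') != null) { val = replaceDate(val); }
        url = url.replace(new RegExp(key), val);
    }
    return url;
}

var commands = {};
commands['countOfSomeEventSince30DaysAgo'] = {
    fn: function () { return factory(countOfSomeEvent, [OP, '>=', DATE, '30daysago']); }
};
commands['countOfSomeEventToday'] = {
    fn: function () { return factory(countOfSomeEvent, [OP, '>=', DATE, 'today']); }
};

// Collect "name,url" lines in an array instead of calling WScript.echo.
function genUrls() {
    var lines = [];
    for (var command in commands) {
        lines.push(command + ',' + commands[command].fn());
    }
    return lines;
}

// genUrls() returns:
//   countOfSomeEventSince30DaysAgo,http://example.com/query?table=events&op=>=&date=2002-01-05
//   countOfSomeEventToday,http://example.com/query?table=events&op=>=&date=2002-02-04
```

Under cscript, each of those lines would be echoed to standard output, one command per line, ready for the next stage of a pipeline. URL-encoding of the interpolated values is left out of the sketch for brevity.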

Language and environment

Pipelining is, in fact, just what I end up doing with that list of URLs, because in batch mode the job's not done when the list is built. The URLs then have to be fetched, the results have to be stored, and the historical view of the saved datapoints has to be recranked through its own template to the web page where people expect to find it. The tool for that job wound up being Perl. Why? Although JavaScript can be decoupled from the browser, it isn't really what I'd call a general-purpose scripting language. It would be useful, of course, to define what that means.

We debate the merits of scripting languages along two major axes: language, and environment. The relative importance of each of these varies according to individual taste. Python hackers regard Perl's syntax as line noise, and Perl hackers find Python's significant whitespace utterly weird. Some never cross the fence for these reasons. Others do, lured by environmental attractions such as Zope or CPAN.

Let's try out a definition of "general-purpose scripting language." The list of key desiderata might be:

Measured by these criteria, JavaScript seems most vulnerable on the last point. Classically it's wired to use the resources of the browser. If decoupled from that environment, it can't extend itself in the direction of self-sufficiency. Rather, it relies wholly on environmental services. On Windows, these include the COM services available via CreateObject and, much more dramatically, the entire .NET framework when JScript is used as a .NET language.

In languages like Perl and Python, there are multiple paths to self-sufficiency. On platforms that do not natively support (or do not allow access to) system services such as SMTP or LDAP, these languages happily roll their own implementations of the services. When services are available, these languages bind to them. This is wonderfully convenient, and accounts in no small measure for my perception that Perl and Python are general-purpose languages.

There is, of course, a cost associated with doing business this way. Small armies of developers in each camp toil endlessly to create and maintain the scripted implementations of services, and the bindings to system-level services, so that users of these languages can enjoy rich and complete environments. Because all this work must be repeated on a per-language and per-service basis, I once called this "a full employment act for open source programmers."

I was reminded again of this problem when I read the marvelous book Programming Ruby, by Dave Thomas and Andy Hunt. The authors characterize the language as "the Perl and Python of the new millennium." Rich Kilmer, who is contributing to an IDE for Ruby, enthusiastically concurs. To my eyes, as an observer but not yet a user of Ruby, it offers some nifty features both as a language and as an environment. On the language front, it puts blocks, closures, and iterators to powerful use, as Thomas and Hunt show in their book and also in a recent DDJ article. As an environment, it augments the usual stuff with some really interesting modules. Rich Kilmer raves most often about Ruby's tuplespace module, which implements a shared bulletin board (a la Linda or JavaSpaces) that can be accessed using complex patterns, and also about Ruby's RMI-like feature, drb (distributed Ruby), which makes it trivial to wire up networks of these tuplespaces.

Ideally a scripting language "for the next millennium" could focus on language and environmental features like these, without having to recreate a whole supporting infrastructure. In hopeful moments I imagine that the .NET framework or a successor to it, available in multiple implementations for multiple platforms, with commercial and open source variants, will be deployed widely enough so that this could happen. But that might turn out to be a mixed blessing. As Osvaldo Pinali Doederlein notes in One Runtime to Bind Them All, there is a sense in which Microsoft has, with the Common Language Runtime, "invented the concept of skinnable language." In other words, how much VBness, or Eiffelness, or JavaScriptness, is left once VB and Eiffel and JavaScript comply with the Common Language Specification and Common Type System? Perhaps the CLR will evolve in ways that give languages more leeway to innovate. Meanwhile I think scripting languages will continue to earn the label "general-purpose" by delivering innovative language features and, as the price of that freedom, fairly heavy life-support systems.

Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He is the author of Practical Internet Groupware, from O'Reilly and Associates. Jon now works as an independent Web/Internet consultant. His recent BYTE.com columns are archived at http://www.byte.com/tangled/

This work is licensed under a Creative Commons License.