Tangled in the Threads

Jon Udell, December 13, 2000

JVM and CLR

Does JVM already deliver what .NET's CLR promises?

Chui Tey's survey of JVM integration projects shows that independent developers are working hard to make it so

In response to a column last month on the language-agnosticism of Microsoft's forthcoming Common Language Runtime, several people wrote to point out that the Java Virtual Machine already in many ways achieves this vision.

Steven Marcus:

The JavaVM is language agnostic.

There are many programming languages available for the JavaVM, including Lisp, Scheme, JavaScript, JPython, Prolog, and Eiffel. See http://grunge.cs.tu-berlin.de/vmlanguages.html .

There is no reason that a future JavaVM couldn't perform the same full translation to native code that the CLR does. Sun has been emphasizing JIT-compilation. This track may pay off with future HotSpot technology. It doesn't have to stay on that track though. The JavaVM on AS/400 compiles (and caches) .class files to native code on demand. Oracle has (promised?) a full native code compiler for Oracle8i -- but I believe that it compiles when you "install/deploy" the Java .class files. There are some commercial Java accelerators that do it as a deployment option on Windows. Also, GNU gcc has a Java backend that does it too.

There is a good amount of "metadata" available about objects available from the JavaVM. The Java Reflection API could be used to expose non-Java objects to a Java program.
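As a rough illustration of the run-time metadata Steven mentions, the sketch below asks the reflection API which public methods a class exposes. The class name MetadataSketch and the helper publicMethodNames are invented for this example, and the code uses present-day Java syntax rather than anything available in JDK 1.3.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class MetadataSketch {
    // Hypothetical helper: collect the names of the public methods
    // (declared or inherited) that a class exposes at run time.
    static List<String> publicMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getMethods()) {
            names.add(m.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print what the JVM knows about java.lang.String's public methods.
        for (String name : publicMethodNames(String.class)) {
            System.out.println(name);
        }
    }
}
```

A foreign-language bridge could, in principle, use this kind of lookup to discover what a Java object offers without any compile-time knowledge of its class.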

Chui Tey:

An alternative to .NET is to use the JVM as the CLR, and then compile different languages to Java bytecodes. There is real value in this, although I'm not sure if Sun is up to this philosophically.

There's the rub. Since Java appeared, it has been clear that Java-the-language was, theoretically, just the preferred interface to the Java Virtual Machine, in the same way that C# is a preferred interface to the .NET Common Language Runtime. Yet Sun, itself, has never chosen to emphasize this point. It's been left to an intrepid band of independent developers to work out ways to integrate other languages with the JVM.

Chui Tey's JVM integration survey

Chui Tey wondered what has been learned from these efforts. So he mailed the following survey to a number of pioneers who've been working on alternate-language projects for JVM:

I would like to invite you to take a few minutes to let us know how "language neutral" you found the JVM. Some points that you may like to touch on are:

  1. Whether you had to omit certain language features to make it work.

  2. Whether the language could be debugged within the JPDA [Java Platform Debugger Architecture] framework at the source level (I note that some implementations involve translation into .java, which were then compiled into .class files). This could make debugging hard.

  3. Level of language interoperability. Were you able to build classes which could be subclassed from Java (perhaps subclassable from another implemented language on the JVM)?

  4. Lastly, but most importantly, whether you encountered any barriers, and what sort of changes to the JVM or perhaps even Java interpreter itself are required to make other languages equal first class languages.

Here are some of the responses that Chui received, and posted back to the newsgroup:

Juergen Neuhoff, Canterbury-Pascal for JVM and Canterbury-Oberon-2 for JVM, http://www.mhccorp.com/java.html

  1. [completeness] The only major limitation we came across was the GOTO statement in Pascal. In Canterbury-Pascal for JVM (which generates Java byte code directly) GOTO is limited to the current function or procedure scope. This limitation is because of the Java byte code.

  2. [debugging] Both Canterbury-Pascal for JVM and Canterbury-Oberon-2 for JVM can generate Java byte code including the necessary source line number and local variable information and hence support almost any Java debugger even on the source level.

  3. [interoperability] All our compilers are capable of directly importing any foreign Java classes such as those from Sun's JDK 1.3 and they can be subclassed without any problems in Pascal, Modula-2 or Oberon-2.

  4. [barriers] Though it was no issue for Pascal or Oberon-2, some programming languages have the multiple inheritance feature. This does not exist for the JVM and would make porting languages which have multiple inheritance like C++ more difficult.

Per Bothner, Kawa, http://www.gnu.org/software/kawa/

Well, language neutrality was clearly not a JVM goal. One of the more gratuitous illustrations of that is that method and field names are required by the verifier to conform to Java syntax. E.g., you can't use "+" as a method name, even though "+" is a perfectly good (and standard) identifier in Lisp languages. The result is that identifiers have to be "mangled". No big deal, but it's a minor annoyance.
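To make the mangling idea concrete, here is a minimal sketch of one possible scheme. It is an assumption made for illustration, not Kawa's actual encoding: every character that Java would reject in an identifier is replaced by "$" followed by its four-digit hex code, and "$" itself is escaped so the mapping stays reversible. The class name Mangler is hypothetical.

```java
public class Mangler {
    // Hypothetical scheme (not Kawa's): escape any character that is not
    // legal at its position in a Java identifier as '$' + 4 hex digits.
    static String mangle(String lispName) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lispName.length(); i++) {
            char c = lispName.charAt(i);
            boolean legal = (i == 0)
                    ? Character.isJavaIdentifierStart(c)
                    : Character.isJavaIdentifierPart(c);
            // '$' is legal in Java names, but we reserve it as the escape marker.
            if (legal && c != '$') {
                out.append(c);
            } else {
                out.append('$').append(String.format("%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mangle("+"));            // prints $002b
        System.out.println(mangle("list->vector")); // prints list$002d$003evector
    }
}
```

Any compiler that targets Java-legal names needs some mapping of this kind, and every tool that shows names back to the programmer (debuggers, stack traces) has to reverse it.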

Speaking as the author of Kawa, which compiles Scheme, a good chunk of Emacs Lisp, and much smaller parts of CommonLisp and EcmaScript, directly into bytecode:

  1. [completeness] Some features are harder to implement, especially efficiently. I have not yet implemented full support for continuations (call/cc), though I'm working towards it. I do have full support for tail-call, but it is less efficient, and it is currently not the default. Both of these problems can possibly be solved using a more clever compiler.

  2. [debugging] I see no reason why not, but I haven't had time to do much about it.

  3. [interoperability] Yes.

  4. [barriers] I don't know about "equal first class languages". I don't think there is any fundamental difficulty implementing all the features of (say) Scheme or Common Lisp. However, implementing them efficiently is another matter.

    There are some more general performance problems with the JVM, even for Java, which force less efficient or more convoluted solutions. For example, returning more than one value is a problem. C#/.NET supports non-heap-allocated objects, so you can return a pair more efficiently, which is nice.

    If you have (or generate) a collection of classes, there is a lot of wasted space, due to symbol duplication, because each class requires its own constant pool.
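The efficiency cost of tail calls that Per describes in point 1 can be sketched with a trampoline, one common way to get constant-stack tail calls on a VM that has no tail-call instruction. This is an illustrative assumption, not a description of how Kawa does it; the names Trampoline, Step, sumTo, and run are invented here, and the code uses present-day Java syntax.

```java
import java.util.function.Supplier;

public class Trampoline {
    // A step is either a finished result (next == null) or a deferred call.
    static final class Step {
        final long value;
        final Supplier<Step> next;

        private Step(long value, Supplier<Step> next) {
            this.value = value;
            this.next = next;
        }

        static Step done(long value) { return new Step(value, null); }
        static Step call(Supplier<Step> next) { return new Step(0, next); }
    }

    // Tail-recursive sum of 1..n. Instead of calling itself directly,
    // each invocation returns a Step describing the call to make next.
    static Step sumTo(long n, long acc) {
        if (n == 0) {
            return Step.done(acc);
        }
        return Step.call(() -> sumTo(n - 1, acc + n));
    }

    // Driver loop: runs one deferred call at a time, so the Java stack
    // never grows no matter how long the chain of tail calls is.
    static long run(Step step) {
        while (step.next != null) {
            step = step.next.get();
        }
        return step.value;
    }

    public static void main(String[] args) {
        // A direct recursive version would likely overflow the stack here.
        System.out.println(run(sumTo(1_000_000, 0)));
    }
}
```

The stack stays flat, but every call now allocates objects on the heap, which is the sort of overhead that makes the technique "less efficient" and a poor default.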

Nik Boyd, Bistro, http://bistro.sourceforge.net/

  1. [completeness] Bistro omits just a few of Smalltalk's language features by choice rather than out of need to make it work on the JVM. For example, Bistro does not provide pool dictionaries. In Smalltalk, pool dictionaries serve as repositories for constants shared between classes. Bistro uses a better design / programming practice incorporated from Java. Shared constants may be included in type definitions (aka Java interfaces) or abstract classes. In Bistro, such shared constants always have an "owner", unlike the constants that reside in Smalltalk pool dictionaries.

    On the other hand, there are certain features provided by Java that made hosting Smalltalk feasible, especially *without* changes to the JVM. Java nested classes can be used to model Smalltalk metaclasses, anonymous inner classes can be used to model Smalltalk blocks, and the Java reflection facility makes it straightforward to implement dynamic method resolution. These were the features that especially made Bistro feasible. I would not attempt to port Bistro to the CLR without support for these features.

  2. [debugging] Bistro qualifies as an implementation that uses Java source code as an intermediate language. Again, this was by choice. I could have chosen to build a direct code generator. However, doing so would have required the implementation of a source-level debugger *before* Bistro could be released. But, building a source-level debugger is hard, even with JPDA. So, I made the choice to use Java source code as an intermediate language. This choice supports the ability to debug the code generated by the Bistro compiler with a commercially available debugger, e.g., Visual Cafe. It makes debugging Bistro feasible and no harder than debugging Java code.

  3. [interoperability] Bistro provides relatively seamless interoperability with Java as a result of many of the design choices I made. Bistro implements its classes and methods using Java classes and methods. Java classes may be derived from Bistro classes and Bistro classes may be derived from Java classes. Bistro classes may freely reuse Java libraries and Java classes may freely reuse Bistro libraries. Whether Bistro classes may be subclassed from another language will depend on how closely the other language uses the Java class model. However, given the other language has support for direct reuse of Java libraries, support for interoperability with Bistro should be straightforward.

  4. [barriers] From my point of view, there are no changes to the JVM that are *required*. However, if the JVM provided direct support for efficient dynamic method resolution, it would likely improve the performance of Smalltalk systems. There may be other features that would make Smalltalk implementations more efficient, but they will likely be pretty specific to Smalltalk.
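Two of the Java features Nik credits in point 1 can be sketched briefly: an anonymous inner class standing in for a Smalltalk block, and reflection used to resolve a method by name at run time. The names BlockSketch, Block, prefixer, and perform are hypothetical and are not taken from Bistro's runtime; this is only a sketch of the general idea.

```java
import java.lang.reflect.Method;

public class BlockSketch {
    // Hypothetical stand-in for a one-argument Smalltalk block.
    interface Block {
        Object value(Object argument);
    }

    // An anonymous inner class captures 'prefix', much as a block
    // closes over variables in its defining scope.
    static Block prefixer(final String prefix) {
        return new Block() {
            public Object value(Object argument) {
                return prefix + argument;
            }
        };
    }

    // Hypothetical unary "perform:"-style send: look up a public
    // no-argument method by name at run time and invoke it.
    // Note: this simple lookup can fail for receivers whose class is not public.
    static Object perform(Object receiver, String selector) {
        try {
            Method method = receiver.getClass().getMethod(selector);
            return method.invoke(receiver);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("receiver does not understand " + selector, e);
        }
    }

    public static void main(String[] args) {
        Block greet = prefixer("hello, ");
        System.out.println(greet.value("world"));     // prints hello, world
        System.out.println(perform("abc", "length")); // prints 3
    }
}
```

The reflective lookup on every send also hints at why Nik expects direct JVM support for dynamic method resolution to improve performance.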

Java-the-language, Java-the-platform

Thanks, Chui, for gathering these thoughtful responses! Clearly the JVM is capable of strong multi-language support, though not explicitly created for that purpose. Why wasn't language agnosticism a specific goal? Perhaps there was some hubris, on Sun's part, in thinking that Java (or indeed any single programming language) could be versatile enough to do everything. But timing was also a factor. As Mark Wilcox pointed out in the newsgroup, "the fast/furious rise of the Net...was one heck of a marketing opportunity" for Java. The emergence of Java as a mobile, portable GUI really muddied the waters. While everyone kept trying to shine the spotlight on applets, it was servlets that proved the real strength of Java as a platform for network services. And in that role, a built-in framework for language integration, scripting, and components would have paid handsome dividends. People want to, and indeed have to, use a variety of languages to build network services. The JVM platform arguably doesn't address that need very well.

Mark Wilcox:

Python and Perl are already available on nearly as many platforms as Java, if not more so. Their interface to native C runtimes certainly gives them a speed edge (in particular File IO). Until Java gets Asynch IO, IO will continue to suck. You lose this ability when porting to Java, which is why Perl went the JPL route so that you could still use all of those modules out there with XS backends.

Alexander Staubo:

I personally care little about the JVM itself; I would be just as happy developing Java applications for the .NET runtime, especially if it provided better performance and/or better integration with other languages. So would, I think, many other developers.

Imagine plugging in an existing C library in your Java project -- as it stands, you end up having to choose between writing a JNI library (clumsy, even with SWIG) or reinventing the wheel in native Java (time-consuming and effort-duplicating).

Though I risk sounding like a Microsoftist here, the current Babel-esque lack of interoperability that divides languages is an impediment to productivity, innovation and true openness.

Every time I sit down with an idea I meet invisible barriers that shouldn't exist -- to write an application in language X that accesses the architecture Y, I need the bindings; without a language-specific binding, you can't use that language. We end up with an immense amount of duplicated effort: For every windowing/graphics framework (Qt, GTK, wxWindows, etc.) we have scripting-language bindings, for example.

.NET is a combination of a common intermediate language, a common runtime, and a common model that allows sharing of language metadata, the sort of magic that allows subclassing stuff written in other languages. Very ambitious, and while not mindblowingly innovative, collective sighs of "Ah! Finally!" are audible across the Net.

It is, of course, patently unfair to compare the reality of Java with the promise of CLR/.NET. There's plenty of time for Sun to respond to Microsoft's initiative. At the end of the day, it's all "just" programming. It will be fascinating to see if Sun now decides to sanction and support the language-agnosticism that was always a latent capability of its Java technology.

It's clear, in any case, that Java is a marvelous language in which to implement other languages. For object-oriented scripting languages such as Perl and Python, Java is in principle a much better fit than C, as Jim Hugunin argues in a paper on JPython. As the JVM languages page shows, Java/JVM has become a veritable research laboratory for programming languages.

I hope that .NET will provoke a reconsideration of the coupling between Java-the-language and Java-the-platform. Each of these facets of Java is, in its own right, impressively versatile. Their conjunction was partly an accident of history. If the combined entity creates "impediments to productivity, innovation and true openness" then people will find ways to route around them. What's at stake is much, much bigger than Microsoft vs. Sun parochialism. Software developers are the engine of the emerging network-services-based economy. They care about performance, language integration, and components. At the end of the day, they'll find ways to bend Java and .NET to serve these needs. For Java as for .NET, the winning strategy will be to offer the path of least resistance.


Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He's now an independent Web/Internet consultant, and is the author of Practical Internet Groupware, from O'Reilly and Associates. His recent BYTE.com columns are archived at http://www.byte.com/index/threads

This work is licensed under a Creative Commons License.