JLCA 3.0 - "Java Language What?"

This probably should come under the Third Base: "I dunno" category. Recently I've been playing with various kinds of content "generators" and Wikipedia came into the crosshairs. Wikipedia has a policy that you can reproduce their content, and a substantial portion of their content is actually very very good and well-researched. There are over 130 listed sites that reproduce Wikipedia content in one form or another, some giving proper attribution, and many not even bothering. Answers.com is one of the biggest, and they do a nice job of it.

The problem is, if you do a Wikipedia title search and get the results back as XML (which they offer), it has a content node filled with that God-awful MediaWiki markup. At that point you have to find a way to convert it to displayable HTML, or it's not going to look very pretty. To the best of my knowledge, nobody has written a "Wiki2HTML" parser in C#.
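To give a feel for what such a converter involves, here is a minimal sketch in Java. It is not any of the existing parsers; the class name `MiniWikiToHtml` and the `/wiki/` link prefix are made up for illustration, and it handles only headings, bold, italics, and internal links. Real MediaWiki markup (templates, tables, nested lists, references) needs far more than a few regular expressions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: converts a tiny subset of MediaWiki markup to HTML.
final class MiniWikiToHtml {

    // Matches "== Title ==" through "====== Title ======" on a single line.
    private static final Pattern HEADING =
            Pattern.compile("^(={2,6})\\s*(.+?)\\s*\\1$");

    static String convert(String wiki) {
        StringBuilder html = new StringBuilder();
        for (String line : wiki.split("\n")) {
            html.append(convertLine(line.trim())).append("\n");
        }
        return html.toString().trim();
    }

    private static String convertLine(String line) {
        if (line.isEmpty()) {
            return "";
        }
        // Escape HTML-significant characters before adding our own tags.
        String s = line.replace("&", "&amp;")
                       .replace("<", "&lt;")
                       .replace(">", "&gt;");
        Matcher heading = HEADING.matcher(s);
        if (heading.matches()) {
            int level = heading.group(1).length();
            return "<h" + level + ">" + inline(heading.group(2)) + "</h" + level + ">";
        }
        return "<p>" + inline(s) + "</p>";
    }

    private static String inline(String s) {
        // Bold (''') must be handled before italics ('').
        s = s.replaceAll("'''(.+?)'''", "<b>$1</b>");
        s = s.replaceAll("''(.+?)''", "<i>$1</i>");
        // [[Target|label]] first, then plain [[Target]].
        // Assumption: links resolve under /wiki/; targets are not URL-encoded here.
        s = s.replaceAll("\\[\\[([^\\]|]+)\\|([^\\]]+)\\]\\]", "<a href=\"/wiki/$1\">$2</a>");
        s = s.replaceAll("\\[\\[([^\\]|]+)\\]\\]", "<a href=\"/wiki/$1\">$1</a>");
        return s;
    }
}
```

For example, `MiniWikiToHtml.convert("== History ==")` returns `<h2>History</h2>`. Everything this sketch leaves out is the reason a ready-made parser is so attractive.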

So, in keeping with my smart developer philosophy of "don't reinvent the wheel", I looked around for some conversion apps - any language, thank you! There are a few very good ones; in fact, about the best one is written in client-side JavaScript. Another good one is in Java. There are some in Ruby, PHP, Python, and Perl. Boo, anyone? Halloween is over. But no C#.

Now, porting that script to JScript.NET is no easy task for a C# developer. Once you have script that starts using prototype and function, JScript.NET has no idea what to do with it - at least not "out of the box". So you'd need to be a real JavaScript expert - I mean GURU-level expert (and I am not) - to convert it.

The next thing I tried was the Java source (.java) files. Did you know that the Microsoft Java Language Conversion Assistant 3.0, which is built into Visual Studio 2005, will load an entire folder full of these babies and happily convert them to C#?

Yes, it will. It even does JSP. Never mind that it makes a struct with static readonly fields as constants instead of an enum - that kind of stuff you can fix. But when you get into some of these wild-ass Visitor patterns, well... Let me just say this: it's one thing to get the code to compile. It's a whole other ballgame to get it to WORK! And I don't think it's so much the differences in the languages, which aren't that great. It's those dang Patterns those Java D00ds use! The poor conversion assistant starts recursing and ends up with its head stuck up its butt!
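For anyone who hasn't run into it, this is roughly the shape of a Visitor in a parser. The sketch below is hypothetical: the names `WikiNode`, `WikiVisitor`, `TextNode`, `BoldNode`, and `HtmlRenderer` are invented here and are not taken from the Java parser mentioned above. The point is the double dispatch: each node calls back into the visitor, and the visitor recurses into child nodes, so the control flow bounces between types instead of reading top to bottom.

```java
// Hypothetical node and visitor types, for illustration only.
interface WikiNode {
    String accept(WikiVisitor visitor);
}

interface WikiVisitor {
    String visitText(TextNode node);
    String visitBold(BoldNode node);
}

final class TextNode implements WikiNode {
    final String value;

    TextNode(String value) {
        this.value = value;
    }

    // First dispatch: the node picks the visitor method for its own type.
    public String accept(WikiVisitor visitor) {
        return visitor.visitText(this);
    }
}

final class BoldNode implements WikiNode {
    final WikiNode[] children;

    BoldNode(WikiNode[] children) {
        this.children = children;
    }

    public String accept(WikiVisitor visitor) {
        return visitor.visitBold(this);
    }
}

final class HtmlRenderer implements WikiVisitor {
    public String visitText(TextNode node) {
        return node.value;
    }

    // Second dispatch: the visitor recurses back into each child node.
    public String visitBold(BoldNode node) {
        StringBuilder sb = new StringBuilder("<b>");
        for (WikiNode child : node.children) {
            sb.append(child.accept(this));
        }
        return sb.append("</b>").toString();
    }
}
```

Rendering `new BoldNode(new WikiNode[] { new TextNode("hi") }).accept(new HtmlRenderer())` yields `<b>hi</b>`. Code like this translates to C# line by line without trouble; the hard part is that the behavior lives in how the pieces call each other, which may be why converted code can compile and still not work.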

I think I'll just stick with the client-side JavaScript and attach the output to a div tag for now!

Comments

  1. Anonymous, 3:06 AM

    Peter,

    Assuming that 'MediaWiki' markup is valid XML, why don't you use XSLT? When it comes to converting XML into another XML-based format, I wouldn't consider using anything else - once you're past the learning curve, it's so much faster, tidier, more maintainable and more enjoyable (assuming it's done properly with template matching rather than xsl:for-each's). Of course, finding an existing XSLT for the job would be even better...

    Andrew

  2. Andrew,
    MediaWiki markup is about as far removed from well-formed XML as a hamburger is from a vegetarian's dinner table.
    Cheers.

  3. Anonymous, 11:42 AM

    Hi, Peter, I read your article at eggheadcafe about how to share sessions between classic ASP and ASP.NET pages. I gave it a shot and it works great. But the problem is that it looks like this strategy only works WITHIN the same application in which you have both classic ASP and ASP.NET pages. I tried this strategy in sharing sessions between a classic ASP application and another application completely in ASP.NET, and it does not work. The sessions could not be delivered to the ASP.NET application. Do you have any trick that will handle this problem? Thanks a lot! Antony Liu (antonyliu2002@yahoo.com)

  4. Antony,
    You can post your question on the forums at eggheadcafe.com, and plenty of very smart people will be there to read and respond. I always answer there, too.
