Inhumane text generation

So, I have been bitten by the bug to get textile.el working again. Hopefully the bug won’t kill me.


Textile, for those who don’t know, is a “Humane Web Text Generator” originally designed and programmed by Dean Allen. He’s a smart guy. Brad Choate came along and seriously expanded it, as well as porting it to Perl for use in Movable Type. He’s a smart guy too.

Me, not so smart. I loved Textile, but more than that, I loved the challenge to process text on this level, and beyond that, I loved the idea of doing it in Emacs. So, over four years ago, I started working on it.

The amazing thing is that the version I abandoned almost entirely works. There are some bugs (and a couple of them are nasty ones), but the output is not bad, and quite usable for the majority of people.

The not-so-amazing thing is that even my second attempt at this algorithm is among the most dreadful things I have ever written. The algorithm got a lot smarter the second time around, but that doesn’t make it good. Wow, it’s bad. I mean, I can hardly follow code that looks like this (skip it if you don’t want to bother):


(defun textile-unextend-blocks (my-list)
  "In a list of textile trees, pull extended blocks together."
  ;; FIXME - this doesn't seem to pass testcases.txt line 285
  (let ((new-list nil))
    (while my-list
      (let* ((this-item (car my-list))
             (my-attr (car (last this-item))))
        (setq my-list (cdr my-list))
        (if (plist-get my-attr 'textile-extended)
            (while (and my-list
                        (not (plist-get (car (last (car my-list)))
                                        'textile-explicit)))
              (textile-append-block this-item (car my-list))
              (setq my-list (cdr my-list))))
        (push this-item new-list)))
    (setq new-list (reverse new-list))
    (while new-list
      (let* ((this-item (car new-list))
             (my-attr (car (last this-item)))
             (clear-info (plist-get my-attr 'next-block-clear)))
        (setq new-list (cdr new-list))
        (if clear-info
            (if new-list
                (progn
                  (setq this-item (reverse (car new-list)))
                  (setq my-attr (plist-put (car this-item) 'style
                                           (concat (plist-get (car this-item)
                                                              'style)
                                                   clear-info)))
                  (setcar this-item my-attr)
                  (push (reverse this-item) my-list)
                  (setq new-list (cdr new-list))))
          (push this-item my-list))))
    (reverse my-list)))

So, let’s see, push the reverse onto my-list, then reverse it again. Is this all because I’m using lists backwards, trying to force them to be arrays? Most likely. This happened to me when I was writing a Sudoku solver (also abandoned, but it mostly works too). Boy, the 80/20 rule applies so heavily to me!
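For comparison, building a list front-first with push and flipping it exactly once at the end is the usual Lisp idiom, and it does not need any intermediate reversals. Here is a minimal sketch of that pattern. The helper my-join-continuations and its "a leading space means continuation" rule are made up for illustration; they are not part of textile.el and have nothing to do with how it represents blocks.

```elisp
;; Hypothetical helper, not from textile.el: glue each "continuation"
;; string (one that starts with a space) onto the string before it.
(defun my-join-continuations (items)
  "Return ITEMS with continuation strings merged into their predecessor."
  (let ((result nil))
    (dolist (item items)
      (if (and result (string-prefix-p " " item))
          ;; Extend the most recent entry instead of pushing a new one.
          (setcar result (concat (car result) item))
        (push item result)))
    ;; A single reversal at the end restores the original order.
    (nreverse result)))

(my-join-continuations '("p. one" " more" "bq. two"))
;; => ("p. one more" "bq. two")
```

The most recent entry is always at the head of the list, so merging into it is cheap, and the list only has to be turned around once.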

Anyway, point is, I don’t think I can pick this project up again. It’s just too horrible to think about getting my head into nearly 1900 lines of spaghetti Emacs Lisp like this. And that means, since I have indeed been bitten by the bug but can’t stand the thought of picking up where I left off, I’m going to have to start over.

First things first: I must recognize that there is good code in this mess. Some of those regular expressions are worth their weight in gold, given the hours I invested in crafting them. How do you work out where the apostrophes and primes go in things like ’80s and 5′7″? At some point, I figured it out. No point in reinventing that wheel. Also, I had a pretty good system for figuring out exactly how Unicode-capable this particular Emacs incarnation was (there were at least three different systems between Emacs 20+Mule-UCS and Emacs 21.4). And the tokenization processes work, although they are hard to read. I have to hang on to those things.
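To give a flavor of the kind of rule involved, here is a rough sketch of the two cases mentioned above. These are not the regular expressions from textile.el, which handle far more cases; the function name my-smarten-quotes and both patterns are assumptions made up for this example.

```elisp
;; Illustrative only: two narrow rules, not textile.el's real patterns.
(defun my-smarten-quotes (text)
  "Return TEXT with feet/inches marks and decade apostrophes fixed up."
  ;; Feet and inches first: 5'7" becomes 5′7″ (prime and double prime).
  (setq text (replace-regexp-in-string
              "\\([0-9]+\\)'\\([0-9]+\\)\""
              "\\1′\\2″" text t))
  ;; Decade abbreviations: '80s becomes ’80s, with an apostrophe
  ;; rather than an opening single quote.  The quote must sit at the
  ;; start of a line or after a non-alphanumeric character.
  (replace-regexp-in-string
   "\\(^\\|[^[:alnum:]]\\)'\\([0-9][0-9]s\\)\\b"
   "\\1’\\2" text t))

(my-smarten-quotes "He was 5'7\" in the '80s.")
;; => "He was 5′7″ in the ’80s."
```

The order matters: the measurement rule runs first so that its straight quote is already gone before any apostrophe rule gets a look at the text.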

Second, I either need to get my head back into how CVS works, or switch with the rest of the world to something that makes sense. I may have to switch. CVS works, but I really hate it.

Third, I think I need to do some pseudocode before I start this time. My plan is to use Emacs’ strengths rather than try to write C or Perl in my head and then port it to Emacs Lisp. This is going to happen in a buffer this time, and I’m going to come up with some way to parse that buffer sequentially rather than running fifteen passes through various (while (re-search-forward ... loops that each have a million exceptions for things that may happen inside them out of order. It may come down to a gigantic regular expression that contains every Textile-ism that could come up, and then it re-search-forward’s once with that. I hope not, but one never knows.
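A single-pass scan along those lines might look something like the sketch below. It is only an assumption about how the idea could be shaped: my-inline-regexp covers just three inline markers, the names are invented for this example, and a real Textile parser would need many more alternatives and proper nesting rules.

```elisp
;; Hypothetical sketch of "one regexp, one pass"; not textile.el code.
(defconst my-inline-regexp
  (concat "\\(\\*[^*\n]+\\*\\)"      ; group 1: *strong*
          "\\|\\(_[^_\n]+_\\)"       ; group 2: _emphasis_
          "\\|\\(@[^@\n]+@\\)")      ; group 3: @code@
  "One alternation covering a few inline Textile-isms.")

(defun my-inline-tokens (text)
  "Return a list of (TYPE . STRING) tokens found in TEXT, in order."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((tokens nil))
      ;; One loop over the buffer; which group matched says what we found.
      (while (re-search-forward my-inline-regexp nil t)
        (push (cons (cond ((match-beginning 1) 'strong)
                          ((match-beginning 2) 'em)
                          (t 'code))
                    ;; Drop the one-character delimiter on each side.
                    (buffer-substring-no-properties
                     (1+ (match-beginning 0)) (1- (match-end 0))))
              tokens))
      (nreverse tokens))))

(my-inline-tokens "a *bold* and _em_ then @x = 1@")
;; => ((strong . "bold") (em . "em") (code . "x = 1"))
```

Because point only ever moves forward, each construct is seen once and in document order, which is the property the fifteen separate passes lacked.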

Why do I get so excited about things that don’t matter at all? Does anybody even use Textile anymore, now that WordPress and Blogger and probably every other major blogging platform have more or less WYSIWYG editor options?

I guess because it’s a challenge. It was fun to write an outline conversion tool this weekend. It’s just my kind of challenge. Every time I do something like this, I learn more about programming in general. I get to do something I’m not often allowed to do these days, and that’s to live in Emacs for a while.

And when I get my new laptop that might actually last for several hours on a charge, maybe I can even do this on the bus. That sounds like fun too. I have to keep my mind sharp. It’s too easy to just let things wilt now.

And someday I still want to write a static blog generation tool in Emacs. Yes, mostly because it’s there.

4 Comments

  1. Gummby (66 comments.)
    Posted 8/11/2008 at 2:07 pm | Permalink

    I have an upcoming post about blogging where I ask about tools being used. You better comment.

    What I found interesting about this is that I pretty much use ASCII text for all of my rough drafts of posts. I’ve tried lots of other stuff, but ended up going back to writing the text, and then fiddling with the markup.

  2. Gummby (66 comments.)
    Posted 8/11/2008 at 2:08 pm | Permalink

    I guess that should probably be “you’d better comment.”

  3. Posted 8/11/2008 at 2:11 pm | Permalink

    That’s fine, colloquialism is acceptable. I’d even take “ya’ll best comment.” Except that some might see “ya’ll” as possessive rather than nominative.

    That said, I will try to remember to comment.

  4. CrazyManAndy (1 comments.)
    Posted 8/13/2008 at 8:03 pm | Permalink

    Actually, I believe the word is “y’all”. ;)

    CMA
