Function EXpression Language

Home

Fexl version b13 released

I just released version b13 of the Fexl source code. This uses fwrite instead of write in string_put.c for better I/O synchronization. The test script is also slightly enhanced.

Current plans

Incidentally, I'm working on an entirely new version of Fexl which will support meta-programming and embedded evaluation. The code for this interpreter has no recursive routines — not even for parsing, or when a function like long_add needs to evaluate its arguments. Everything is done from a single flat evaluation loop which uses value nodes to maintain its own stack.

Currently the parser is written in C, and calls itself recursively to parse terms and expressions. In the new code, the parser will in effect be written in Fexl itself, and will evaluate in the normal evaluation loop just like any program written by the user. To resolve this chicken-and-egg scenario, the combinatorial form for the parser will be hardcoded into the C code, e.g. looking something like A(A(S,I), ...). I am writing the parser in Fexl now, and it will generate its own form automatically for me so I don't have to do it by hand.

I already have much of the infrastructure done. It all relies on the simple concept of a "tree", which is either a symbol, with a name and position (line number), or an application, which is the pairing of a left tree with a right tree. The constructors are:

# A tree is either (sym name pos) or (app fun arg).  The pos is used to report
# the position of any undefined symbols so you can find them easily in your
# source code and fix them.

\sym = (\name\pos \sym\app sym name pos)
\app = (\fun\arg  \sym\app app fun arg)

By convention, the parser puts an underscore "_" in front of each name it reads in the user's program. So for example, the names x, C, 4.8, or "hello" appear in the tree as symbols named _x, _C, _4.8, or _"hello".

In contrast, the parser also inserts built-in symbols itself for its own purposes, such as C, S, I, L, R, and Y. These symbols appear literally in the tree with those names, so that the parser can recognize them directly for pattern-matching purposes.

The parser uses this function to create named variables:

# A variable is a symbol whose name starts with "_".
\var = (\name sym (string_append "_" name))

And this function to detect variable names:

# Return true if the name starts with "_", indicating a variable name.
\is_var_name = (\name long_eq (string_ch name) (string_ch "_"))

Now here is the very important high-level function lam, which does lambda abstraction:

# The lam function abstracts x from f.  The result g is such that:
#
#    1. g contains no occurrences of x.
#    2. g x = f

\lam == (\x\f
    tree_eq x f I;
    f
    (\_\_ app C f)
    \fun\arg
    simplify (app (app S (lam x fun)) (lam x arg))
    )

After all variables are abstracted out, the parser calls resolve to convert the tree form into the corresponding actual evaluable function:

# Replace all symbols in tree f with their definitions in the environment env,
# and convert app trees into actual function applications.
#
# If all symbols are defined, it returns (yes value), where value is the actual
# evaluable function denoted by the tree.
#
# If a symbol is undefined, it returns (no name pos), which is the name and
# position of the first undefined symbol it encountered.

\resolve == (\env\f\yes\no
    f
    (\name\pos
        env name pos yes;
        no name pos
    )
    \fun\arg
    resolve env fun
        (\fun
            resolve env arg
            (\arg yes (fun arg))
            no
        )
        no
    )

So that's pretty much the whole parsing system at a high level, not including the actual part that reads the individual characters of the user's program, which I won't bore you with here.

But if you punch down a little further, you'll note some other interesting details. Note that lam calls simplify, which recognizes the various special forms for optimization:

\simplify == (\f
    \apply = (\x\y simplify; app x y)
    \match = (\p match p f) 

    # S (C x) (C y) = (C (x y))
    match (app (app S (app C x)) (app C y)) 
        (\x\y apply C (apply x y));

    # I x = x
    match (app I x) 
        (\x x);

    # C x y = x
    match (app (app C x) y) 
        (\x\y x);

    # S (C x) I = x
    match (app (app S (app C x)) I) 
        (\x x);

    # S (C x) y = R x y
    match (app (app S (app C x)) y) 
        (\x\y apply (apply R x) y);

    # S x (C y) = L x y
    match (app (app S x) (app C y)) 
        (\x\y apply (apply L x) y);

    # Y (C x) = x
    match (app Y (app C x)) 
        (\x x);

    # R I = I
    match (app R I) 
        I;

    # S C x = I
    match (app (app S C) x) 
        (\x I);

    # No rules apply.
    f
    )

That does some amazing things, especially given that it calls itself via that little helper function apply.

Now let's punch down a little deeper still. Note that simplify calls the match function, which matches a pattern p with form f. This is a spectacular bit of magic here, using high-order functions to accomplish something almost inconceivable in an ordinary imperative language. In effect, it binds any pattern variables to the corresponding matched subtrees, sort of Prolog-style:

# Match pattern p with tree f, collecting any matched variables.
\match == (\p\f\yes\no\collect
    p
    (\pn\_
        is_var_name pn (yes (collect f));
        f
        (\fn\_ string_eq pn fn (yes collect) no)
        \_\_ no
    )
    \p0\p1
        f
        (\_\_ no)
        \f0\f1
            match p0 f0
                (match p1 f1 yes no)
                no
                collect
    )

\match = (\p\f\yes\no match p f I no yes)

There are some other mildly interesting things, such as the environment which provides all the standard built-in functions during the resolve phase:

# This environment provides all the standard built-in functions.
\env_standard = (

    \name\pos\yes\no

    # Resolve string constants.
    string_unquote name yes;

    # Resolve long and double constants.
    # Minor inconvenience here:  the string_long and string_double functions
    # use \no\yes order instead of \yes\no, so I have to swap the order.

    (\no string_long name no yes);
    (\no string_double name no yes);

    # Resolve the individual function names.

    \case = (string_eq name)

    case "?" (yes ?);

    case "item" (yes item);
    case "pair" (yes pair);

    case "long_add" (yes long_add);
    case "long_sub" (yes long_sub);
    case "long_mul" (yes long_mul);
    case "long_div" (yes long_div);
    case "long_compare" (yes long_compare);
    case "long_type" (yes long_type);
    case "long_string" (yes long_string);
    case "long_char" (yes long_char);
    case "long_double" (yes long_double);

    case "double_add" (yes double_add);
    case "double_sub" (yes double_sub);
    case "double_mul" (yes double_mul);
    case "double_div" (yes double_div);
    case "double_compare" (yes double_compare);
    case "double_type" (yes double_type);
    case "double_string" (yes double_string);
    case "double_long" (yes double_long);

    case "string_append" (yes string_append);
    case "string_compare" (yes string_compare);
    case "string_put" (yes string_put);
    case "string_at" (yes string_at);
    case "string_len" (yes string_len);
    case "string_slice" (yes string_slice);
    case "string_common" (yes string_common);
    case "string_type" (yes string_type);
    case "string_long" (yes string_long);
    case "string_double" (yes string_double);

    case "char_get" (yes char_get);
    case "char_put" (yes char_put);

    # Resolve the basic combinator names used in the parser itself.

    env_basic name pos yes;
    no
    )

Wed 2011-08-31

Previously

Fexl version b12 released