Syntax

Many programmers who see a Self code fragment published in a paper find it impossible to believe that this is just a dialect of Smalltalk: "No way! This looks like LISP!" they say. Obviously they are not too familiar with Smalltalk-72 or -74.

The problem is that those listings combine several different elements that should be considered separately. Separating them is important for understanding how Neo Smalltalk differs from Self 4.

Message Passing

Self has "message passing at the bottom". That is, everything that happens outside an object as well as inside it can be explained in terms of messages being sent. So a key aspect of the language is the way these messages are described.

Rule 1: A message is a sequence of any characters (except for the period) with one or more underlined subexpressions. Any white space characters are simply ignored.

This is compatible with both Smalltalk-80 and Self. We can have an expression like

  100factorial.

which is like a unary message in these languages. The "100" is being shown here to make the examples simpler in a Swiki - see below how literal objects are actually handled. The final character, the period, is not part of the message.

Both binary and keyword messages from Smalltalk can be written using rule 1:

   3+4.
   3between:1and:9.
   3between:1And:9.

The second line above follows Smalltalk rules, while the third line follows Self rules. Since precedence is always explicitly indicated via the underlining, we can mix both styles in the same code without any conflicts.

All expressions have at least one level of underlining (though it might be invisible due to rule 2 below) and it is easy to imagine that two or three levels of underlining would look reasonably clear. But what about more levels? Here is the number of (top level) expressions in the Self 4.1.6 system for each level of underline nesting:

levels	number of expressions
1	3647
2	13716
3	7193
4	2264
5	730
6	235
7	89
8	34
9	14
10	15
11	3
12	1
13	4
14	4
16	2
17	1
19	6
21	2
22	1
23	1
24	1
27	1

Things are better than the table seems to imply, since only 30 methods are responsible for all expressions with a nesting level of 9 or more, which is when things really become awkward. They are all methods with complex object construction expressions and could be written differently.
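As a quick check on the table, the short Python sketch below tallies its rows. The counts are copied from the table above; the cut-offs (3 levels for "reasonably clear", 9 levels for "awkward") are the ones mentioned in the text, and the variable names are only illustrative.

```python
# Rows copied from the table above: underline nesting level -> number of expressions.
counts = {
    1: 3647, 2: 13716, 3: 7193, 4: 2264, 5: 730, 6: 235, 7: 89, 8: 34,
    9: 14, 10: 15, 11: 3, 12: 1, 13: 4, 14: 4, 16: 2, 17: 1, 19: 6,
    21: 2, 22: 1, 23: 1, 24: 1, 27: 1,
}

total = sum(counts.values())
# Expressions with at most three levels of underlining.
shallow = sum(n for level, n in counts.items() if level <= 3)
# Expressions with nine or more levels, where things become awkward.
deep = sum(n for level, n in counts.items() if level >= 9)

print(f"total expressions:  {total}")
print(f"levels 1 to 3:      {shallow} ({shallow / total:.1%})")
print(f"levels 9 and above: {deep} ({deep / total:.2%})")
```

By this tally, 24556 of the 27964 expressions (close to 88%) need no more than three levels, and only 56 expressions reach level 9 or more.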

Deeply nested underlines are also less of a problem than they seem at first, since you only have to notice where a new level starts and ends and visually match that with other fragments at the same nesting level. The absolute number of levels doesn't matter at all, so there is no need to count them.

All examples so far have stressed compatibility with previous Smalltalks, meaning that old code can easily be typed in from books or (with a special parser) read from a file. But rule 1 is far more flexible than that. Control structures, for example, can look more natural than in Smalltalk:

   if...then please do[...].

The "receiver" doesn't have to be the very first thing in a message. This is also perfectly valid:

   sin45.

The name of a message, its selector, is very flexible, so we will write it as the sequence of characters in the message with all white space removed and each underlined expression replaced by the period character (the only one that can't be part of a selector). The selectors of the messages we have seen so far are therefore .factorial , .+. , .between:.and:. , .between:.And:. , if.thenpleasedo. and sin. (where the final period is part of the selector).
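The selector rule can be sketched in a few lines of Python. Since plain text cannot show underlining, the sketch assumes the parenthesized form described under "ASCII representation" below, where (this) stands for underlined text. The function name selector is made up for this illustration, and the omitted-receiver case of rule 2 is not handled.

```python
def selector(message: str) -> str:
    """Derive the selector of one message written in parenthesized ASCII form."""
    text = "".join(message.split())  # white space is simply ignored
    if text.endswith("."):
        text = text[:-1]             # the final period is not part of the message
    out = []
    depth = 0
    for ch in text:
        if ch == "(":
            if depth == 0:
                out.append(".")      # a top-level underlined expression becomes "."
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')'")
        elif depth == 0:
            out.append(ch)           # only text outside any underline names the message
    if depth != 0:
        raise ValueError("unbalanced '('")
    return "".join(out)


for example in ["(100)factorial.", "(3)+(4).", "(3)between:(1)and:(9).",
                "if(...)then please do(...).", "sin(45)."]:
    print(example, "->", selector(example))
```

Nested underlines are skipped entirely, so ((3)+(4))factorial. has the same selector as (100)factorial. The two control-structure arguments are written here as (...) placeholders, which is an assumption about how the elided parts would be saved.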

We could also have .sin in the same system and there would be no danger of confusing the postfix and prefix versions of the selector.

Rule 2: An empty underlined expression represents the currently executing context. When it would be the first character in a message it can be omitted.

That is like the implicit receiver in Self. Combined with the flexibility of rule 1, however, the possibility of ambiguity does arise. The message .help. with an omitted first argument (receiver) would look exactly like the message help. and if both are defined it is not possible to know which one was intended. To handle this, the first interpretation is always chosen, so it is not a good idea to define a prefix selector like help. in such a system. Of course, color (or boldface) could be used to separate the two cases.
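The tie-breaking rule can be illustrated with a small Python sketch. It assumes selectors are written as described earlier (a period for each underlined expression) and that the lookup has a set of defined selectors at hand; the function name resolve and that set are hypothetical. It also assumes the implicit-receiver reading wins only when that selector is actually defined, which the text implies but does not state.

```python
def resolve(typed: str, defined: set) -> str:
    """Pick the selector meant by a message as it was typed."""
    if typed.startswith("."):
        return typed                   # receiver written explicitly, nothing to decide
    with_implicit_receiver = "." + typed
    if with_implicit_receiver in defined:
        return with_implicit_receiver  # the first interpretation is always chosen
    return typed                       # otherwise it can only be the prefix form


# Both forms defined: the prefix version help. can never be reached.
print(resolve("help.", {".help.", "help."}))
# Only the prefix form defined: no ambiguity.
print(resolve("sin.", {"sin.", ".sin"}))
```

In the second call .sin is a different selector from .sin. so the postfix and prefix versions of sin do not interfere with each other.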

Object Description

Self objects are made up of slots, each with a name and a value. The object description syntax allows us to write a text from which the object can be recreated. Of course, we could simply write a long sequence of message sends that would explicitly build the object, as long as there is at least one pre-existing object to send messages to and it understands a set of low-level object manipulation messages. Rather than do things this way, Self adapted the syntax that Smalltalk uses for block objects to describe objects in general. This is very convenient, but rather complex. And with parentheses delimiting regular objects and methods (square brackets for blocks, as in Smalltalk), programs can indeed look very much like LISP.

But while Self 1 through 3 would load code created with external text editors, Self 4 included a full graphical development environment. Objects can now be created with the mouse and the object description syntax is no longer needed. It is still used, mostly at the "text prompt", as a shortcut for creating very simple (normally empty) objects. And the slot syntax is still needed because GUI editing is incomplete.

Neo Smalltalk does not have an object description syntax. In fact, as the next section shows, it doesn't have a literal syntax of any kind. You must either use graphical tools to build objects or be prepared to send a very long and boring set of messages to some mirror object.

Other Literals

Self has a syntax for describing generic objects, methods and blocks. It also has more compact representations for several kinds of numbers and a C-like syntax for strings. Smalltalk also has literal characters and symbols (both just strings in Self) and Arrays. A few dialects add some more literals, with SmallScript having the richest set.

Neo Smalltalk does not include syntax for any kind of literal object at all. But it does allow any object to be directly inserted into the middle of source code, and the editor includes several shortcuts that greatly simplify getting at the most common kinds of objects. So if you are typing along and press the ' key, it will insert an empty string object at that place in the source and move the focus into that object's editor, so that the next key you hit will insert a character into the string. The string's editor might be set up (or not) so that pressing the ' key within it saves the updated string and returns the focus to the main editor. The same goes for the [ key and blocks, the 0 to 9 keys and numbers, and so on. Each programmer would configure other keys for easy access to the kinds of objects they use most - F3 might insert a vector and F5 a bitmap, for example.

So the several examples shown above for Self would be typed the same way in Neo Smalltalk but would look slightly different since it would be clear that the string and the numbers are actual, live objects and not text.

[Has anyone actually tried programming like this? It sounds OK on the surface, but the reality might be awful. I would plan on some kind of textual alternative, maybe just for archiving - David Harris]

ASCII representation

Since parentheses aren't really needed with this syntax, we can use (this to represent underlined (and doubly underlined) text) when saving to a text file. Though it looks too busy, it is readable enough to get the job done.
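Reading such a file back is mostly a matter of tracking how deeply nested each character is. The Python sketch below is one way this could be done, not a description of any actual Neo Smalltalk loader; the function name underline_depths is invented here. It returns the visible text together with the underline depth of every character, from which an editor could redraw the underlines.

```python
def underline_depths(ascii_source: str):
    """Split parenthesized source into visible text and per-character underline depth."""
    chars, depths = [], []
    depth = 0
    for ch in ascii_source:
        if ch == "(":
            depth += 1                 # one more level of underlining starts here
        elif ch == ")":
            depth -= 1                 # the innermost level ends here
            if depth < 0:
                raise ValueError("unbalanced ')'")
        else:
            chars.append(ch)
            depths.append(depth)
    if depth != 0:
        raise ValueError("unbalanced '('")
    return "".join(chars), depths


text, depths = underline_depths("((3)+(4))factorial.")
print(text)
# Show each character's depth under it (blank where there is no underline).
print("".join(str(d) if d else " " for d in depths))
print("nesting level:", max(depths))
```

One limitation of this sketch is that an empty underlined expression, written (), leaves no character behind and is therefore lost; a real loader would need an explicit marker for it.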

The lack of literals is a far harder problem. We can introduce a notation to assign a name to a given expression and another to refer to that same object in a later expression. Something like

   $seven((3)+(4)).
   (2)($seven)

Obviously the "3", "4" and "2" would actually be defined using message sends just like $seven was, but showing that would make the example too large.