Collections (Ctalk Tutorial)

Collections are groups of objects. Ctalk has the following main classes of collections: Array, List, AssociativeArray, Stream, and TreeNode.

These classes are subclasses of Collection class. Here is the section of the class library that shows the organization of Collection and its subclasses.

 Collection
  Array
  List
   AssociativeArray
  Stream
   FileStream
    DirectoryStream
    ReadFileStream
    WriteFileStream
   TerminalStream
    ANSITerminalStream
    Win32TerminalStream
    X11TerminalStream
  TreeNode

This chapter provides a description of the basic collection classes, AssociativeArray, List, and TreeNode.

The previous chapters described Array and Stream subclasses (the classes FileStream, ReadFileStream, and WriteFileStream).

The later sections that discuss graphics also discuss the TerminalStream subclasses in the context of providing user input, because these classes are intended to be used for input from devices like keyboards and mice.

List Class

The push method adds an object to the end of a List.

List new l;
Integer new i;
Integer new i2;

l push i;
l push i2;

The pop method removes the object at the end of the list and returns it.

List new l;
Integer new i;
Integer new i2;
Integer new i3;
Integer new i4;

l push i;
l push i2;

i3 = l pop;
i4 = l pop;

The methods shift and unshift are similar to push and pop, respectively, but they add and remove items from the front of the list.

Here is a simple, if slightly unwieldy, example of how these methods work. You can find this program in test/basiclist.c.

int main () {

  List new l;
  Integer new i1;
  Integer new i2;
  Integer new i3;
  Integer new i4;
  Integer new i5;
  Integer new i6;

  i1 = 1;
  printf ("%d ", i1);
  i2 = 2;
  printf ("%d ", i2);
  i3 = 3;
  printf ("%d ", i3);
  printf ("\n");

  l push i1;
  l push i2;
  l push i3;

  i4 = l pop;
  printf ("%d ", i4);
  i5 = l pop;
  printf ("%d ", i5);
  i6 = l pop;
  printf ("%d ", i6);
  printf ("\n");

  l shift i1;
  printf ("%d ", i1);
  l shift i2;
  printf ("%d ", i2);
  l shift i3;
  printf ("%d ", i3);
  printf ("\n");

  i4 = l unshift;
  printf ("%d ", i4);
  i5 = l unshift;
  printf ("%d ", i5);
  i6 = l unshift;
  printf ("%d ", i6);
  printf ("\n");

  return 0;
}
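
The ordering that these four methods produce can be modeled in a few lines of plain C. This is only a sketch of the behavior described above. It is not Ctalk code, and it makes no claim about how List class is implemented; the ModelList type and the model_* function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity model of a list of ints; not Ctalk's List. */
#define MODEL_CAP 16

typedef struct {
  int items[MODEL_CAP];
  size_t count;
} ModelList;

/* push: add an item at the end of the list. */
void model_push (ModelList *l, int item) {
  assert (l->count < MODEL_CAP);
  l->items[l->count++] = item;
}

/* pop: remove the item at the end of the list and return it. */
int model_pop (ModelList *l) {
  assert (l->count > 0);
  return l->items[--l->count];
}

/* shift: add an item at the front of the list. */
void model_shift (ModelList *l, int item) {
  size_t i;
  assert (l->count < MODEL_CAP);
  for (i = l->count; i > 0; i--)
    l->items[i] = l->items[i - 1];
  l->items[0] = item;
  l->count++;
}

/* unshift: remove the item at the front of the list and return it. */
int model_unshift (ModelList *l) {
  int item;
  size_t i;
  assert (l->count > 0);
  item = l->items[0];
  for (i = 1; i < l->count; i++)
    l->items[i - 1] = l->items[i];
  l->count--;
  return item;
}
```

In this model, pushing 1, 2, and 3 and then popping three times yields 3, 2, and 1. Adding 1, 2, and 3 with model_shift and then calling model_unshift three times also yields 3, 2, and 1, because both pairs of functions work at a single end of the list.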

Generally, however, you want to be able to work with Lists of any length. The basic method for working sequentially with every item in a list is the method map.

List instanceMethod printItem (void) {
  Integer new element;
  element = self;
  printf ("%d ", element);
  return NULL;
}

int main () {

  List new l;
  Integer new i1;
  Integer new i2;
  Integer new i3;

  i1 = 1;
  i2 = 2;
  i3 = 3;

  l push i1;
  l push i2;
  l push i3;

  l map printItem;
  printf ("\n");
}

The argument to map, in this example printItem, must be an instance method of class List, even though the receiver of each call to printItem, which is each successive member of List l (in main()), can have its own class.

This example uses Integers as list items, so we can assume that self in the printItem method is always going to be an Integer. But programs cannot always make assumptions about what class of items a List stores. See self Class Resolution, for a detailed discussion.

Some classes, including List, allow you to provide an additional argument to the target method. In that case, you must use that argument as the second argument to map, after the name of the method that’s going to do work on each List item. These sorts of expressions are easier to write than to describe. So here’s a revised version of the example above which should make this clearer.

                              /* Here, "leftMargin" is passed as    */
                              /* the second argument to map, below. */
List instanceMethod printItem (String leftMargin) {
  Integer new element;
  element = self;
  printf ("%s%d ", leftMargin, element);
  return NULL;
}

int main () {

  List new l;
  Integer new i1;
  Integer new i2;
  Integer new i3;
  String new leftMargin;

  leftMargin = "  ";

  i1 = 1;
  i2 = 2;
  i3 = 3;

  l push i1;
  l push i2;
  l push i3;

  l map printItem, leftMargin;  /* "leftMargin" is the first argument when */
                               /* printItem is called.                    */
  printf ("\n");
}

This syntax works only when the argument to map is a separate method, not with argument blocks. See ArgumentBlocks.

List class actually allows you to pass two additional arguments to the method that the list maps over. Extending the number of arguments is beyond the scope of this tutorial, but Ctalk internally can support a range of conventions when mapping over List and other Collection objects.
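
If you’re more familiar with C, mapping with an extra argument resembles calling a function pointer once for each element and handing it the same extra argument each time. The following plain C sketch illustrates only that idea. The names map_items, print_item, and PrintState are hypothetical, and the sketch says nothing about how Ctalk implements map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical extra argument: a left margin plus a buffer that
   collects the formatted output. */
typedef struct {
  const char *leftMargin;
  char out[64];
} PrintState;

/* The "method" that gets applied to each element. */
typedef void (*ItemFn) (int element, void *arg);

/* Call fn once for each element, passing the same extra argument. */
void map_items (const int *items, size_t n, ItemFn fn, void *arg) {
  size_t i;
  assert (fn != NULL);
  for (i = 0; i < n; i++)
    fn (items[i], arg);
}

/* Append "<margin><element> " to the output buffer. */
void print_item (int element, void *arg) {
  PrintState *s = arg;
  size_t used = strlen (s->out);
  snprintf (s->out + used, sizeof s->out - used, "%s%d ",
            s->leftMargin, element);
}
```

Calling map_items with the elements 1, 2, and 3, the function print_item, and a PrintState whose leftMargin is two spaces leaves each element in the buffer, preceded by the margin.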

Initializing Lists

When initializing Lists, you can also use the = method to add items to the List, which you may find more convenient. You could replace the push methods in the example above with the expression “l = i1, i2, i3;”.

If you also want to skip the assignment statements for i1, i2, and i3, the statement “l = 1, 2, 3;” initializes the list directly.

There’s more information about initializing Lists and other types of collections later in the Tutorial. See InitializingCollections.

To remove all of the items from a List and leave the list as it was when it was first created, you can simply use the delete method.

Otherwise, a method like pop removes the object and returns it, leaving it up to the program to either delete the object or use it elsewhere. So if you want to remove (and in this case delete) individual objects from a list, you can use an expression like this.


while ((item = myList pop) != NULL)
  item delete;

Or, more briefly:

while (item = myList pop)
  item delete;

But if the program uses a loop like this one,

while (myList pop)
  ;

then it does nothing with the objects that were formerly stored in myList, and the expression causes memory leaks, which are especially noticeable if the program replaces the objects in myList one or more times.
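
The ownership rule behind this caution is the same one that applies to heap memory in C: once pop hands an object back, the caller is responsible for it. Here is a plain C sketch of that rule, under the assumption that each item is a separately allocated int. Stack, stack_push_new, stack_pop, item_delete, and live_items are hypothetical names, not part of Ctalk.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
  int *item;            /* heap-allocated item that the list refers to */
  struct Node *prev;
} Node;

typedef struct {
  Node *top;
} Stack;

static int live_items = 0;   /* items allocated but not yet deleted */

/* Allocate a new item and push it. */
void stack_push_new (Stack *s, int value) {
  Node *n = malloc (sizeof *n);
  if (n == NULL) abort ();
  n->item = malloc (sizeof *n->item);
  if (n->item == NULL) abort ();
  *n->item = value;
  live_items++;
  n->prev = s->top;
  s->top = n;
}

/* Remove the top item and return it; the caller now owns it.
   Returns NULL when the stack is empty. */
int *stack_pop (Stack *s) {
  Node *n = s->top;
  int *item;
  if (n == NULL) return NULL;
  item = n->item;
  s->top = n->prev;
  free (n);
  return item;
}

/* Release an item that was returned by stack_pop. */
void item_delete (int *item) {
  assert (item != NULL);
  free (item);
  live_items--;
}
```

A loop that passes every popped item to item_delete brings live_items back to zero. A loop that only calls stack_pop empties the stack but leaves live_items unchanged, which is the leak described above.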

Creating Objects on the Fly

It’s often tedious or impractical to declare each List member separately. For cases like that, Ctalk has a method called basicNew, which creates objects on the fly, as in this hypothetical example.


List new myList;
Integer new listMember;
Integer new i;
String new nameStr;
String new valueStr;

for (i = 0; i < 10; i = i + 1) {

    nameStr = "List Member " + i asString;
    valueStr = i asString;   

    listMember = Integer basicNew nameStr, valueStr;

    myList push listMember;

}

The basicNew method has several flavors, which are described in the Ctalk Reference Manual.

AssociativeArray Class

An AssociativeArray stores its elements using Key objects. Programs use the name of the Key object, normally a String, to store and retrieve elements of the array. The value of the Key object is the object that you want to store in the array.

The AssociativeArray class does the work of assigning keys and storing objects in the arrays for you, using the methods at and atPut.

myAssociativeArray atPut "keyName", myObject;

myObjectFromBefore = myAssociativeArray at "keyName";

Initializing AssociativeArrays

AssociativeArray is another class, like List, that allows you to set multiple elements with one expression. The methods =, +=, init, and append can take a variable number of keys and values as their arguments.

Ctalk interprets the argument list to contain any number of key,value pairs following either the method =, or init. The two are interchangeable. Here is an example.


myAssocArray init "key1", "value1", "key2", "value2", "key3", "value3";

... or ...

myAssocArray = "key1", "value1", "key2", "value2", "key3", "value3";

The += and append methods add the key,value pairs given as the arguments to the end of the receiver AssociativeArray, as in this example.


myAssocArray init "key1", "first", "key2", "second", "key3", "third", 
  "key4", "fourth";
myAssocArray append "key5", "fifth", "key6", "sixth", "key7", "seventh",
  "key8", "eighth";

... or ...

myAssocArray = "key1", "first", "key2", "second", "key3", "third", 
  "key4", "fourth";
myAssocArray += "key5", "fifth", "key6", "sixth", "key7", "seventh",
  "key8", "eighth";

You can also treat AssociativeArray elements sequentially, using the methods map and mapKeys. The Ctalk Language Reference contains examples of their use.

If you want to remove an object from an AssociativeArray or another subclass of Collection, use the removeAt method. The removeAt method returns the object that was stored in the collection.

The Ctalk Language Reference describes the details of storing and retrieving Key object names and values.
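
To make the roles of at, atPut, and removeAt concrete, here is a small plain C model of a collection whose keys have names and refer to values stored elsewhere. It is a sketch under stated assumptions, not Ctalk’s implementation: the types ModelKey and ModelAssoc and the assoc_* functions are hypothetical, and the choice to replace the value when a key name already exists belongs to this model only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_KEYS 16

typedef struct {
  const char *name;   /* the key's name */
  void *value;        /* reference to the stored object */
} ModelKey;

typedef struct {
  ModelKey keys[MAX_KEYS];
  size_t count;
} ModelAssoc;

/* Return the key with the given name, or NULL if there is none. */
static ModelKey *find_key (ModelAssoc *a, const char *name) {
  size_t i;
  for (i = 0; i < a->count; i++)
    if (strcmp (a->keys[i].name, name) == 0)
      return &a->keys[i];
  return NULL;
}

/* Like atPut: store value under name (model assumption: an
   existing key with the same name has its value replaced). */
void assoc_at_put (ModelAssoc *a, const char *name, void *value) {
  ModelKey *k = find_key (a, name);
  if (k == NULL) {
    assert (a->count < MAX_KEYS);
    k = &a->keys[a->count++];
    k->name = name;
  }
  k->value = value;
}

/* Like at: return the value stored under name, or NULL. */
void *assoc_at (ModelAssoc *a, const char *name) {
  ModelKey *k = find_key (a, name);
  return k ? k->value : NULL;
}

/* Like removeAt: remove the key and return the value it referred to. */
void *assoc_remove_at (ModelAssoc *a, const char *name) {
  ModelKey *k = find_key (a, name);
  void *value;
  if (k == NULL) return NULL;
  value = k->value;
  /* Close the gap, keeping the remaining keys in order. */
  memmove (k, k + 1, (size_t) (&a->keys[a->count] - (k + 1)) * sizeof *k);
  a->count--;
  return value;
}
```

Note that assoc_remove_at returns the stored reference without freeing anything, which mirrors the way removeAt returns the object that was stored in the collection.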

TreeNode Class

A TreeNode object is a component of a tree data structure. Each TreeNode object can have sibling and child objects. The TreeNode class provides basic methods to add objects to a tree, to set each TreeNode's content, and to traverse the tree.

The example program shows two different ways to construct trees. It actually builds two trees, which are identified by their head nodes, head and head2.

TreeNode instanceMethod printNode (void) {
  printf ("%s\n", self content);
}

int main () {

  TreeNode new head;
  TreeNode new head2;
  TreeNode new sibling;
  String new content;
  Symbol new sib;
  Symbol new child;
  TreeNode new tSib;
  TreeNode new tChild;
  TreeNode new tChildChild;
  Integer new i;

  content = "Head Node";

  head setContent content;

  head2 setContent content;

  printf ("%s\n", head content);

  for (i = 1; i <= 10; i++) {
    content = "1. Sibling Node " + (i asString);

    *sib = TreeNode basicNew content, content;
    *sib setContent content;
    head makeSibling *sib;

    content = "Child Node";
    tChild = TreeNode basicNew content, content;
    tChild setContent content;
    (*sib) addChild tChild;

    content = "2. Sibling Node " + (i asString);

    tSib = tSib basicNew content, "TreeNode", "Symbol", content;
    tSib setContent content;
    head2 makeSibling tSib;

    content = "Child Node";
    tChild = TreeNode basicNew content;
    tChild setContent content;
    tSib addChild tChild;

  }

  head2 siblings map {
    content = "Child Node 2";
    tChild = TreeNode basicNew content, content;
    tChild setContent content;
    tSib = TreeNode basicNew "Sibling of Child";
    tSib setContent "Sibling of Child";
    tChild makeSibling tSib;
    (TreeNode *)self addChild tChild;

    content = "Child of Child";
    tChildChild = TreeNode basicNew content, content;
    tChildChild setContent content;
    tChild addChild tChildChild;

  }

  printf ("--------------------\n");

  head2 map printNode;
}

The program uses two different methods to add nodes to a tree, both contained within the for loop.


    /* Build the first tree by using Symbol objects to refer to the
       tree's nodes. */

    content = "1. Sibling Node " + (i asString);  

    *sib = TreeNode basicNew content, content;
    *sib setContent content;
    head makeSibling *sib;

    content = "Child Node";
    tChild = TreeNode basicNew content, content;
    tChild setContent content;
    (*sib) addChild tChild;


    /* Programs can also build trees by using TreeNode objects directly,
       though for complex programs this method may be less convenient. */

    content = "2. Sibling Node " + (i asString);

    tSib = tSib basicNew content, "TreeNode", "Symbol", content;
    tSib setContent content;
    head2 makeSibling tSib;

    content = "Child Node";
    tChild = TreeNode basicNew content;
    tChild setContent content;
    tSib addChild tChild;

Referring to TreeNode objects directly and by Symbol references produces similar results here. Using Symbol references, however, might be more flexible in the case of complex programs.

In each case, adding nodes to the tree uses the same four methods: basicNew, setContent, makeSibling, and addChild.

When run, the program produces output similar to this (somewhat shortened) example.

Head Node
--------------------
2. Sibling Node 1
Child Node
Child Node 2
Child of Child
Sibling of Child
2. Sibling Node 2
Child Node
Child Node 2
Child of Child
Sibling of Child

...

2. Sibling Node 10
Child Node
Child Node 2
Child of Child
Sibling of Child

If you want each level of the tree indented, you can use either the print method to print the tree to the screen, or format to format the tree to a String object. The class provides the levelMargin and levelMarginLength instance variables, which allow programs to specify the indentation of each level.
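
The traversal order in the output above (a node, then its children, then its siblings) and the idea of a per-level margin can both be sketched in plain C. This model is an assumption made for illustration. ModelNode, make_sibling, add_child, and format_tree are hypothetical names, the margin is simply a number of spaces per level, and the sketch doesn’t reproduce TreeNode’s print or format methods.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct ModelNode {
  const char *content;
  struct ModelNode *child;     /* first child, or NULL */
  struct ModelNode *sibling;   /* next sibling, or NULL */
} ModelNode;

/* Append node to the end of first's chain of siblings. */
void make_sibling (ModelNode *first, ModelNode *node) {
  while (first->sibling != NULL)
    first = first->sibling;
  first->sibling = node;
}

/* Add node after any children that parent already has. */
void add_child (ModelNode *parent, ModelNode *node) {
  if (parent->child == NULL)
    parent->child = node;
  else
    make_sibling (parent->child, node);
}

/* Append each node's content to out, indented by level * margin
   spaces: the node first, then its children, then its siblings. */
void format_tree (const ModelNode *node, int level, int margin,
                  char *out, size_t outSize) {
  assert (out != NULL);
  while (node != NULL) {
    size_t used = strlen (out);
    snprintf (out + used, outSize - used, "%*s%s\n",
              level * margin, "", node->content);
    format_tree (node->child, level + 1, margin, out, outSize);
    node = node->sibling;
  }
}
```

With a margin of 2, a node’s children appear two spaces to the right of the node, and their children two spaces further in.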

Collection Elements as Receivers

When working with collection elements, programs often don’t know until run time exactly what the class of a receiver might be. In addition, methods like map use each element of a collection as a receiver.

Collections all use Key objects to store the actual data. In methods that use collection elements, programs can either declare a Key object explicitly, or they can rely on a method in a superclass to determine whether the receivers are valid. The Ctalk Web Utilities manual provides examples of both techniques.

Here is getValue in Collection class, which determines that the actual receiver of a getValue message at run time is a Key object and generates an exception otherwise.

Collection instanceMethod getValue (void) {
  Exception new e;

  if (self is Key) {
    return self getValue;
  }
  e raiseCriticalException INVALID_RECEIVER_X, 
    "Message \"getValue\" sent to a Collection object, not a Key object.";
  return NULL;
}

Ctalk has several ways to cope with ad-hoc classes, which this tutorial describes further on. See self Class Resolution.

Using Math Operators with Collections (Still More about Looping)

Earlier this tutorial discussed how overloaded math operators work with different classes.

Once again, Ctalk overloads the math operators +, -, ++, --, +=, -=, and * so they can work with collection elements.

More specifically, the operators work with Key objects, so this section describes the Collection subclasses that use Key objects to maintain their elements - the classes List, Array, AssociativeArray, and any of their subclasses that a program implements.

The methods also work with elements of TreeNode objects, with a few extra steps, which are discussed below.

Here is an example program that uses a Key object to iterate over a List.


int main (int argc, char **argv) {

  List new a;
  Key new k;
  Key new j;
  Integer new i;

  a push "value1";
  a push "value2";
  a push "value3";
  a push "value4";

  k = *a;
  i = 0;

  while (++k) {
    printf ("%s\n",*k);
    if (i == 1) {
        j = k;
        printf ("j = k: %s\n", *k);
    }
    ++i;
  }

  printf ("-------------------\n");
  printf ("%s\n", *j);

}


When run, the program prints the following output.

value2
value3
j = k: value3
value4
-------------------
value3

You should note that the elements get printed starting with value2. The reason for this is discussed in the next section.

The statement “k = *a;” sets the Key object k to the first element of a, which is a List. When the operator * is used with a Collection, it refers to the first element of the collection. When * is used with the Key object itself, though, it refers to the contents of the list element. So the following expression prints the contents of the first element of a.


printf ("%s\n", **a);

The loop

  while (++k) {
    printf ("%s\n",*k);
    if (i == 1) {
        j = k;
        printf ("j = k: %s\n", *k);
    }
    ++i;
  }

sets k to each successive member of the list, then prints its value (using the expression “*k”), and, on the pass where i equals 1, sets j to k, so the program can refer to that list element later.

When k reaches the end of the list, it is set to NULL, so the loop terminates and the program proceeds with the instructions further down.

The main limitation of these operators is that they operate only on objects that a program declares with new (and basicNew in most cases). That means the operators don’t have any effect on instance and class variables.

It is often possible, though, to refer to an object’s instance variable by another object, which the math operators can work with - so, for example, to loop through a TreeNode object, you might use an expression like this one.


TreeNode new tree;
Key new s;

... Add nodes to the tree. ...

s = *tree siblings;

while (s++) {

... do stuff ...

}

Ctalk’s operator precedence evaluates method and instance variable labels before math operators, so the expressions “*tree siblings” and “*(tree siblings)” are equivalent.

Some Things to Look Out For with Math Operators

There are a few cautions that you need to observe when overloading math operators, however. One of the main cautions is trying to assign a Key to an empty collection, which is almost guaranteed to cause a program to crash. So if you’re in doubt about the contents of the input you can check that it isn’t empty, or you can check the size of the tokenized output, as in this code.


List new tokenList;

aString tokenize tokenList;

if (tokenList size > 0) {   /* Make sure the list isn't empty... */

... then do stuff ...

}

Another very critical issue is trying to use math with self, if the program maps over a collection. That’s because self is already the value of each successive item in the collection, and it’s not always guaranteed that self is going to be a class that you can apply pointer math to.


aString tokenize tokenList;

tokenList map {

  aTok = self + 1;   /* Wrong... self is not a Key object. */

}

If you must use math within an argument block, you can try storing the tokens in an AssociativeArray, and then using mapKeys to iterate over it.


aString tokenize assocArray;

assocArray mapKeys {

  aTok = self + 1;

}

But this relies on what self actually refers to within the code block, which you need to watch carefully. And changing the value of self is almost certain to cause the loop to behave unpredictably.

Also, programs need to take care when trying to cast self to a particular class.


aString tokenize aList;

aList map {

  aTok = (Key *)self + 1;

}

This example only works if aList is, for example, a list of Key objects, which have the semantics to do what you expect the math operators to do.

Another caveat is using ++ or -- in a while loop. This can cause the loop to miss members of the collection.


tok = *aList;

while (tok++) {          /* tok gets iterated after it retrieves the
                             value... */

  printf ("%s\n", *tok); /*  so here, tok points to the next member
                             of the collection. */

}

And here, if you use a prefix operator, the loop still misses members of the collection.


while (++tok) {

  printf ("%s\n", *tok);  

}

It’s usually more correct to use a do... while loop to iterate over a collection, as in this example.


tok = *aList;

do {

  printf ("%s\n", *tok);

} while (++tok);


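Alternatively, test the Key at the top of a while loop and increment it at the end of the loop body.


tok = *aList;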

while (tok) {

  printf ("%s\n", *tok);

  ++tok;
}
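
The difference between these loop forms is easy to check with a plain C model in which each key points to the next one. The sketch below is not Ctalk, and IterKey and the visit_* functions are hypothetical names. It shows only that advancing before the first visit skips the first element, while the other two forms visit every element.

```c
#include <assert.h>
#include <stddef.h>

typedef struct IterKey {
  const char *value;
  struct IterKey *next;   /* NULL at the end of the list */
} IterKey;

/* Advance, then test, like "while (++tok)": the first element
   is never visited.  Returns the number of elements visited. */
size_t visit_prefix_while (IterKey *first, const char **seen) {
  IterKey *tok = first;
  size_t n = 0;
  assert (first != NULL);
  while ((tok = tok->next) != NULL)
    seen[n++] = tok->value;
  return n;
}

/* Visit, then advance, like "do { ... } while (++tok)". */
size_t visit_do_while (IterKey *first, const char **seen) {
  IterKey *tok = first;
  size_t n = 0;
  assert (first != NULL);
  do {
    seen[n++] = tok->value;
  } while ((tok = tok->next) != NULL);
  return n;
}

/* Test, visit, then advance at the end of the loop body. */
size_t visit_while (IterKey *first, const char **seen) {
  IterKey *tok = first;
  size_t n = 0;
  while (tok != NULL) {
    seen[n++] = tok->value;
    tok = tok->next;
  }
  return n;
}
```

For a three-element list, visit_prefix_while visits two elements and the other two functions visit all three. In this C model, visit_while also accepts an empty list, because it tests the pointer before using it.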

Another place to exercise caution is when using complex expressions in-line, for example, as an argument to printf ().


  // Trim leading spaces. 
  if (str matchRegex "^ *\"", offsets > 0)
    printf ("%s\n", str + str matchLength);   // Produces the wrong result.

Ctalk can decide to evaluate the argument separately, which can cause the receiver to be assigned to the result before the entire expression is evaluated. If the operator seems to have no effect, you can try using eval to tell Ctalk to evaluate the entire expression at once, or simplifying the terms of the expression.


  if (str matchRegex "^ *\"", offsets > 0)
    printf ("%s\n", eval str + str matchLength);

  ... or ...

  if (str matchRegex "^ *\"", offsets > 0) {
    matchStart = str matchLength;
    printf ("%s\n", str + matchStart);
  }

Another place to be careful is when you try to assign a collection element to a Key directly.


  *destListPtr = *globalList;

  for (...) {
     ... do stuff...
     (*destListPtr) = (*(globalList + myInt));
  }

This has the effect of changing the object that destListPtr refers to; basically it reshuffles the members of globalList, so you need to be careful if an expression contains a ‘*’ on the left-hand side of an assignment. If you only want to iterate destListPtr over the list, use something like this.


  destListPtr = *globalList;

  for (...) {
     ... do stuff...
     ++destListPtr;
  }

Here’s an example of how dereferencing works when using a numerical offset to retrieve a List element.


Object class ListWrapper;

ListWrapper instanceVariable items List NULL;

int n = 1;


ListWrapper instanceMethod prOffset (void) {
  String new item;
  
  self items map {
    item = *(super items + n);
    printf ("%s\n", item);
  }
}

int main () {
  ListWrapper new lw;

  lw items push "Item 0";
  lw items push "Item 1";
  lw items push "Item 2";
  lw items push "Item 3";
  lw items push "Item 4";

  lw prOffset;
}

The main action in this case occurs in the prOffset method. Here, we want to retrieve only the n’th element in the receiver list. We’ve placed it inside an argument block, so the keyword super refers to prOffset's receiver while the program is executing in the argument block’s scope.


item = *(super items + n);

Here, item is a String object. When we want to set its value, we need to get the list element’s content.

The result of the expression “super items + n” is a Key object. That’s why the ‘*’ dereference operator is there - it retrieves the object that the nth position in the list refers to.

Internally, the organization of a Collection object, or an instance of one of its subclasses (like List in this example) is this: a sequence of keys that comprise instance variables in addition to the value instance variable (internally, it’s a C list also). This is what an expression like super items + n would retrieve: the n’th key (counting from zero) from the entire set of the collection’s keys.

However, each key doesn’t itself contain the collection’s contents; each of the keys contains a reference to another object, which may be declared anywhere else in the program. From the perspective of a Ctalk expression, all of the target objects together comprise the contents of the collection.

When you use a statement like *(super items + n) in a situation like this, it instructs the class to return the object that the n’th key refers to.


                               (super items + n)
                              n = 0         n = 1          ... etc.
 -------       -------       -------       -------
|       |     | value |     |       |     |       |
| aList | --> | inst. | --> |  Key  | --> |  Key  | -->    ...
|       |     |  var. |     |       |     |       |
 -------       -------       -------       -------

                                |             |
                                |             |

                              *(super items + n)                                
                             -------       -------
                            |       |     |       |
                            |Content|     |Content|
                            |       |     |       |
                             -------       -------
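
The same distinction between a key and the object it refers to can be drawn in plain C. The following sketch models the diagram with a chain of keys, each holding a pointer to content stored elsewhere. RefKey, key_at, and content_at are hypothetical names, and the sketch is an illustration of the idea, not a description of Ctalk’s internals.

```c
#include <assert.h>
#include <stddef.h>

typedef struct RefKey {
  const char *content;    /* reference to an object stored elsewhere */
  struct RefKey *next;
} RefKey;

/* Like "items + n": return the n'th key, counting from zero,
   or NULL if the list has fewer than n + 1 keys. */
RefKey *key_at (RefKey *first, size_t n) {
  while (first != NULL && n > 0) {
    first = first->next;
    n--;
  }
  return first;
}

/* Like "*(items + n)": return the object that the n'th key
   refers to.  The key must exist. */
const char *content_at (RefKey *first, size_t n) {
  RefKey *k = key_at (first, n);
  assert (k != NULL);
  return k->content;
}
```

Here key_at returns the key itself, and content_at takes the extra step of following the key’s reference, which is the step that the ‘*’ operator performs in the Ctalk expression.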


Here is another way to write prOffset.

ListWrapper instanceMethod prOffset (void) {
  Key new item;
  String new content;
  
  self items map {
    item = super items + n;
    content = *item;
    printf ("%s\n", content);
  }
}

The main difference is where the dereference operator, ‘*’, occurs within the method.

And here is yet a third way that we can retrieve the content of the n’th list element.


ListWrapper instanceMethod prOffset (void) {
  Key new item;
  
  self items map {
    item = super items + n;
    printf ("%s\n", *item);
  }
}

Here, the retrieval of the list’s String element occurs when the printf statement is executed.

It’s probably worth mentioning in passing that the actual names of the Key objects in a List aren’t significant. The methods in List class treat each key in the collection by referring to the object that comprises the list’s contents in succession (generally, this is referred to as each element’s value), whether the key/value item is at the beginning or end of the list, or by the key/value’s position in the list.

However, collections like AssociativeArray and Array do use key names that are significant. An example is AssociativeArray class’s mapKeys method, which the class provides in addition to the more common map method. The mapKeys method doesn’t directly retrieve the collection’s values; that is, in an argument block that uses mapKeys, self refers to a Key object. There are examples of this and other expressions that use collections in the Ctalk Language Reference.

Class Casting

Ctalk allows you to cast an object to any class you like. This allows programs to take advantage of the semantics that one class provides in case they’re needed for some other class, among other advantages.

For example, when iterating over collections, a cast can be useful when you want to work with the actual collection element, instead of a copy. Look at this hypothetical example.


List new myListOfInts;
Integer new listElement;
Key new listKey;

listKey = *myListOfInts;    

while (listKey) {

  (Object *)listElement = *listKey;  /* This causes the program to
                                        use Object : =, which assigns
                                        by reference, instead of
                                        Integer : = which assigns by
                                        value, so that listElement
                                        is the actual list item, not
                                        just a copy of its value. */

  ... do stuff with listElement...                                    

  ++listKey;
}

Of course, it’s up to you to ensure that the object is valid for the class you’re casting it to. Class casts, however, work equally well with objects and OBJECT * C variables.

To find out more about class casting, refer to the section titled Class casting in the Ctalk Language Reference.
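
The difference between assigning by value and assigning by reference, which the cast in the example above selects, has a direct analogue in C. The following sketch is only an analogy. ModelInt, assign_by_value, and assign_by_reference are hypothetical names, and the code doesn’t show how Ctalk’s = methods are implemented.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
  int value;
} ModelInt;

/* Assign by value: copy the element's value into a separate
   object.  Later changes to dest don't affect the element. */
void assign_by_value (ModelInt *dest, const ModelInt *element) {
  assert (dest != NULL && element != NULL);
  dest->value = element->value;
}

/* Assign by reference: make the caller's variable refer to the
   element itself, so changes made through it reach the element. */
void assign_by_reference (ModelInt **dest, ModelInt *element) {
  assert (dest != NULL);
  *dest = element;
}
```

After assign_by_value, changing the copy leaves the list element untouched. After assign_by_reference, a change made through the reference is a change to the element in the list.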

Initializing Collections

Most of the examples in this tutorial use a series of push or atPut methods to initialize collections. However, several of the subclasses of Collection class (namely, List and AssociativeArray classes) also overload the = and += math operators so you can initialize a new collection or add members to an existing collection with one expression.

The following example initializes myList with a group of String objects using one statement.


myList = "string1", "string2", "string3", "string4";

This statement initializes myList to contain exactly the four String objects given as arguments. If the program added any members to myList before this statement, they are first removed from the list.

To add members to myList without affecting the current contents of the list, use the method, +=.


myList = object1, object2, object3, object4;

myList += object5, object6, object7, object8;
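
The contrast between the two operators can be modeled in plain C as “clear, then add” versus “add”. This is a sketch of the behavior described above, with hypothetical names (ModelSeq, seq_assign, seq_append) and a fixed-capacity array of ints standing in for the list.

```c
#include <assert.h>
#include <stddef.h>

#define SEQ_CAP 16

typedef struct {
  int items[SEQ_CAP];
  size_t count;
} ModelSeq;

/* Like "+=": add the new members after the current contents. */
void seq_append (ModelSeq *s, const int *items, size_t n) {
  size_t i;
  assert (s->count + n <= SEQ_CAP);
  for (i = 0; i < n; i++)
    s->items[s->count++] = items[i];
}

/* Like "=": remove the current contents first, then add the
   new members. */
void seq_assign (ModelSeq *s, const int *items, size_t n) {
  s->count = 0;
  seq_append (s, items, n);
}
```

Calling seq_assign with four members always leaves exactly four members in the model, no matter what it held before, while a following seq_append with four more leaves eight.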

Here is the example from the section discussing Lists, slightly abbreviated. See ListClass.


                              /* Here again, "leftMargin" is passed as */
                              /* the second argument to map, below. */
List instanceMethod printItem (String leftMargin) {
  Integer new element;
  element = self;
  printf ("%s%d ", leftMargin, element);
  return NULL;
}

int main () {

  List new l;
  String new leftMargin;

  leftMargin = "  ";

  l = 1, 2, 3;

  l map printItem, leftMargin;  /* "leftMargin" is the first argument when */
                               /* printItem is called.                    */
  printf ("\n");
}

AssociativeArray class also overloads = and +=, but the methods interpret the argument list as a set of key, value pairs.


myArray = key1, value1, key2, value2, key3, value3, ... ;

The keyN arguments may be either a String constant or simply a label; Ctalk interprets them as the key names regardless of their class when it creates the receiver’s keys. The valueN argument can be whatever the receiver AssociativeArray needs to contain.

Here’s a brief example program that initializes and then prints the contents of an AssociativeArray.


int main () {
  AssociativeArray new a;

  a = "key1", "first", "key2", "second", "key3", "third", "key4", "fourth";
  a += "key5", "fifth", "key6", "sixth", "key7", "seventh", "key8", "eighth";

  a mapKeys {
    printf ("%s --> %s\n", self name, *self);
  }
}

As the tutorial mentioned above, the keyN objects do not need to be String constants. They can simply be objects that are created when the argument is parsed, and that can be used to name the Key objects in the receiver collection.

That means that a statement like “a = key1, "first", key2, "second";” would work equally well to initialize the AssociativeArray in the previous example.
