5.7.1 Inner language and memory compression based on the demo of imitation
Using a static imitation goal as in the demo, we can calculate rough
compression / decompression estimates.
For the perception component, translating video images to inner language descriptions:
- Pixelated image to vector graphics represents a compression of almost 100 to 1
- There are various image standards, both pixel-based and vector-based. This rough estimate is based on the image of the instructor shown in the demo: we started with a TIFF, converted (traced) it into SVG, and compared file sizes in bytes.
- Vector graphics to language represents a further compression of about 100 to 1
- There is no clear representation for the hypothesized "inner" language, so instead we used a simple description in English. We took the SVG file above and compared its size to the byte count of the text file containing the description in English.
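Treating each stage as a plain byte-count comparison, the two perception ratios compound. The file sizes below are hypothetical placeholders, chosen only to match the rough 100-to-1 estimates above (the actual byte counts from the demo's TIFF, SVG, and text files are not reproduced here); a minimal sketch:

```python
def ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Compression ratio as a simple byte-count comparison."""
    return original_bytes / compressed_bytes

# Hypothetical byte counts, chosen only to illustrate the ~100:1 estimates;
# the real numbers come from the demo's TIFF, SVG, and English-text files.
tiff_bytes = 2_000_000   # pixel image (TIFF)
svg_bytes = 20_000       # traced vector version (SVG)
english_bytes = 200      # English description of the scene

stage1 = ratio(tiff_bytes, svg_bytes)       # pixels -> vectors, ~100:1
stage2 = ratio(svg_bytes, english_bytes)    # vectors -> language, ~100:1
overall = ratio(tiff_bytes, english_bytes)  # end to end, ~10,000:1

print(stage1, stage2, overall)
```

If the two stages simply multiply, the end-to-end compression from pixels to inner-language description would be on the order of 10,000 to 1.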
For the action component, translating commands to muscle tension specifications:
- Decompression from language to joint angles
- For lack of anything better, we use a modified representation of Labanotation, the notation used in ballet for capturing choreography. A very rough estimate would be an expansion on the order of 1 to 10 from the English description into Labanotation.
- Decompression from joint angles to muscle tension
- Since this information flows millisecond by millisecond, the timing and duration of the movement are very important.
- From our earlier estimates on page 1.2.4 we have a rough ratio of 160 to 1 between visual information and action information.
- Based on this earlier estimate, the joint-angle to muscle-tension decompression should be around 1 to 6, corresponding to roughly 6 muscles per joint, which matches our earlier estimate of 3 degrees of freedom per joint with 2 balancing muscles each.
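The action-side estimates can be cross-checked with the same kind of arithmetic. The factors below are the rough ratios stated above, not measured values; assuming the stages simply multiply, the 1-to-10 and 1-to-6 expansions compound into a language-to-muscle-tension decompression on the order of 1 to 60 (a derived figure, not one stated in the demo):

```python
# Rough expansion factors from the estimates above (not measured values).
language_to_joint_angles = 10   # English -> Labanotation-style joint angles, ~1:10
dof_per_joint = 3               # degrees of freedom per joint
muscles_per_dof = 2             # one balancing (agonist/antagonist) muscle pair per DOF

joint_to_muscle = dof_per_joint * muscles_per_dof       # ~1:6, i.e. 6 muscles per joint
overall_expansion = language_to_joint_angles * joint_to_muscle  # ~1:60 end to end

print(joint_to_muscle, overall_expansion)
```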
For both perception and action there is a problem of perspective and directionality, which we will discuss on the next page.