Commonsense Computing @ Media

The Roboverse Domain

Push Singh, Bo Morgan, and Radu Raduta

To separate the study of architectures for commonsense thinking from the need for immense commonsense knowledge bases, we have built a virtual physical world in which several simulated people work together to solve problems, such as building a table. Each person engages in commonsense reasoning within the spatial, physical, and social realms to decide what actions to take. The user can interact with these robots via a natural language speech interface to request that they adopt specific goals or take particular actions.

Pink: I see you are building a tower.

Blue: Yes, but I cannot reach that blue block.

Pink: I can reach it. Let me get it for you.

While this domain may seem sparse, its simplicity hides a great depth of issues. In particular, the mental realms we have discussed so far all show up in some form in this domain. Because the world is physically realistic, the people must reason about the effects of gravity on objects and the forces that must be applied to move them. Because the people have synthetic vision systems, they must reason about whether objects that seem to have disappeared behind bigger ones are in fact still there. Because there are several people, they must reason about the social challenges that arise between them, such as conflicts between their goals and possible opportunities for cooperation. Solving problems in this world requires reasoning simultaneously about the physical, social, psychological, and several other mental realms.
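
As a rough illustration of the occlusion reasoning described above, the following sketch shows one way a simulated person might decide whether a block that has dropped out of view is probably still there. It is not the Roboverse implementation: the `Block` class, the `hidden_behind` and `belief_about` functions, and the flat geometry (each block reduced to a horizontal span and a depth from the viewer, with no perspective) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Block:
    """Hypothetical block model: a horizontal span at some depth from the viewer."""
    name: str
    x: float      # horizontal center as seen by the viewer
    depth: float  # distance from the viewer
    width: float

    def span(self) -> Tuple[float, float]:
        return (self.x - self.width / 2, self.x + self.width / 2)


def hidden_behind(last_seen: Block, visible: Iterable[Block]) -> Optional[Block]:
    """Return a visible block that is nearer and fully covers last_seen, if any."""
    lo, hi = last_seen.span()
    for other in visible:
        o_lo, o_hi = other.span()
        if other.depth < last_seen.depth and o_lo <= lo and hi <= o_hi:
            return other
    return None


def belief_about(last_seen: Block, visible: Iterable[Block]) -> str:
    """Describe what the viewer should believe about a previously seen block."""
    visible = list(visible)
    if any(b.name == last_seen.name for b in visible):
        return "visible"
    occluder = hidden_behind(last_seen, visible)
    if occluder is not None:
        return f"probably still there, hidden behind {occluder.name}"
    return "unknown: nothing in view explains the disappearance"


blue = Block("blue", x=0.0, depth=5.0, width=1.0)
pink = Block("pink", x=0.0, depth=3.0, width=3.0)
print(belief_about(blue, [pink]))
```

Here the small blue block was last seen directly behind where the larger pink block now stands, so the sketch keeps believing in it rather than concluding that it has vanished. A fuller treatment would also need perspective, partial occlusion, and the possibility that someone moved the block while it was hidden.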

Aaron Sloman proposed this example of a sequence of increasingly sophisticated problems in this domain:

  1. Person wants to get box from high shelf. Ladder is in place. Person climbs ladder, picks up box, and climbs down.
  2. As for 1, except that the person climbs ladder, finds he can't reach the box because it's too far to one side, so he climbs down, moves the ladder sideways, then as 1.
  3. As for 1, except that the ladder is lying on the floor at the far end of the room. He drags it across the room and lifts it against the wall, then as 1.
  4. As for 1, except that if asked while climbing the ladder why he is climbing it, the person answers something like "To get the box." The person should understand why "To get to the top of the ladder" or "To increase my height above the floor" would be inappropriate, albeit correct.
  5. As for 2 and 3, except that when asked, "Why are you moving the ladder?" the person gives a sensible reply. This can depend in complex ways on the previous contexts, as when there is already a ladder closer to the box, but which looks unsafe or has just been painted. If asked, "would it be safe to climb if the foot of the ladder is right up against the wall?" the person can reply with an answer that shows an understanding of the physics and geometry of the situation.
  6. The ladder is not long enough to reach the shelf if put against the wall at a safe angle for climbing. Another person suggests moving the bottom closer to the wall, and offers to hold the bottom of the ladder to make it safe. If asked why holding it will make it safe, the person gives a sensible answer about preventing rotation of the ladder.
  7. There is no ladder, but there are wooden rungs, and rails with holes from which a ladder can be constructed. The person makes a ladder and then acts as in previous scenarios. (This needs further unpacking, e.g. regarding sensible sequences of actions, things that can go wrong during the construction, and how to recover from them, etc.)
  8. As for 7, but the rungs fit only loosely into the holes in the rails. Person assembles the ladder but refuses to climb up it, and, if asked, can explain why it is unsafe.
  9. Person watching another who is about to climb up the ladder with loose rungs should be able to explain that a calamity could result, that the other might be hurt, and that people don't like being hurt.
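
The geometry behind scenarios 5 and 6 can be sketched in a few lines. This is only an illustration, not code from the project: the function names, the treatment of the ladder as a rigid line segment leaning against a vertical wall, and the example numbers (a 3.0 m ladder, a 2.95 m shelf, a 75 degree "safe" lean angle) are assumptions chosen for the example.

```python
import math
from typing import Optional


def reach_height(ladder_length: float, angle_deg: float) -> float:
    """Height of the ladder's top on the wall when it leans at angle_deg above the floor."""
    return ladder_length * math.sin(math.radians(angle_deg))


def foot_distance(ladder_length: float, angle_deg: float) -> float:
    """Horizontal distance from the wall to the foot of the ladder."""
    return ladder_length * math.cos(math.radians(angle_deg))


def can_reach(ladder_length: float, shelf_height: float, angle_deg: float) -> bool:
    """True if the top of the ladder is at least as high as the shelf."""
    return reach_height(ladder_length, angle_deg) >= shelf_height


def min_angle_to_reach(ladder_length: float, shelf_height: float) -> Optional[float]:
    """Smallest lean angle (degrees) at which the ladder top reaches the shelf.

    Returns None when the ladder is too short even if it stands vertically.
    """
    if shelf_height > ladder_length:
        return None
    return math.degrees(math.asin(shelf_height / ladder_length))


SAFE_ANGLE_DEG = 75.0  # assumed "safe" lean angle for this example

ladder, shelf = 3.0, 2.95
print(can_reach(ladder, shelf, SAFE_ANGLE_DEG))          # too short at the safe angle
needed = min_angle_to_reach(ladder, shelf)
print(needed is not None and needed > SAFE_ANGLE_DEG)    # a steeper angle is required
```

With these numbers the ladder falls just short of the shelf at the assumed safe angle, and the angle needed to reach it is steeper, which is the situation in scenario 6 where a second person offers to hold the foot of the ladder. The sketch covers only reach; explaining why a steep or unheld ladder may rotate or slip would require reasoning about forces and friction as well.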

We are developing our commonsense reasoning systems by making them face a substantial library of such graded sequences of mini-scenarios that require them to learn new skills, to improve their ability to reflect on those skills, and (with practice) to become much more fluent and quick at achieving these tasks.

For more discussion of this approach, please visit Aaron Sloman's web page, "Metrics and Targets for a Grand Challenge Project Aiming to produce a child-like robot."

2004 MIT Media Lab