Friday, 1 December 2017

Before Wildwood: the Arboretum engine

(this piece was written as an email in a response to a question by Chas Emerick about whether it is possible to explain inference to ordinary people. It describes the Arboretum engine, which started me thinking about how you make a computable logic which people can understand.)

Right, let's try to set this out in some sensible detail. I'll start by giving you the bones of the story, and then move on to talk about the things you're actually interested in. I'm copying Peter Mott in so he can correct any misrememberings of mine (I have long term severe depression, and it really messes with your memory).

I did a minor course in logic and metaphysics when I was an undergrad. When I graduated in 1986, my logic tutor, Peter Mott, who was entirely deaf, was invited to join an AI project - the "Alvey DHSS Large Demonstrator" essentially as the formal logic input to the process. He invited me to join him as his research associate, mainly I think because having been in his seminars for two years I was pretty fluent at discussing technicalities of logic in sign language, and could interpret for him. I was never the world's most brilliant logician, but I could more or less do it.

Peter was certainly responsible for the logic of the "D-Engine", and I think wrote the first prototype in Acornsoft Lisp himself. The D-Engine logic is a Popperian propositional logic, and the goal of the engine at any stage is to seek to falsify what it currently believes. Again I think it was Peter's insight that this would map naturally onto a systematic practice for interviewing domain experts, but, being deaf, he couldn't develop this interviewing practice himself.

A rule in D-Engine logic is naturally a tree, which we called a "D-Tree". It nodes are propositions "coloured" with truth values, and edges essentially represent "unless" relationships. Because D-Engine logic naturally argues from the general to the particular, it's possible to recover extremely good natural language explanations of decisions from simple boilerplate fragments on the nodes.

There's a paper explaining all this: Mott, P. & Brooke, S. A Graphical Inference Mechanism. Expert Systems, May 1987. vol. 4. No. 2 pp. 106-117

(Golly. I've just - for the first time ever - done a google search for that paper, to see it it was online anywhere. It isn't, but it's cited in lots of places - including by Marc Eisenstadt! I'm pleasantly suprised).

I built the second prototype, Arboretum, in Interlisp on Xerox 1108 Dandelion and 1186 Daybreak machines, and that was the one that was used in the project; we demonstrated that it was possible to automate decision making on a piece of legislation (eligibility for widows' allowance), and, critically, to automatically generate letters to claimants which claimants could understand.

At the end of the project, Peter and I and a number of others of Peter's ex-students spun out a company to try to commercialise D-Engine logic. As (again) Peter was deaf, and as he intended to continue as a lecturer, he chose not to head up the company, and I was elected Chief Exec; I was also the technical lead. We knew we couldn't commercialise our work on the Xerox workstations, because they were way too specialist and expensive to sell into industry. I had a look at Windows, which was at version 2 at that stage, and it was just a horrible mess. We couldn't afford Macintosh (and, in any case, the M68000 Macs of those days were really not very powerful); so we decided to write the new version on Acorn Archimedes ARM machines, on the basis that I already had one. The plan was always to port it to UN*X later.

There was a good Lisp compiler for the Archimedes machines, but it was Portable Standard Lisp which was going out of favour (Common Lisp had just been standardised). It suffered from another problem which was for us more serious: the garbage collector moved things in memory, and the Archimedes window manager held fixed pointers to things; so, while I did a lot of work on this and got somewhere, the solution wasn't going to be portable.

So we decided to rewrite the system in C (which at that stage none of us knew). We also decided to do something else ambitious: we decided that because our new version - called KnacqTools for "Knowledge Acquisition Tools" - would require a graphics workstation, and such machines were then uncommon in industry and expensive, we should compile the knowledge once elicited down to run on one of a number of rule engines that were then available for PCs; and we decided that these rule language compilers should be pluggable.

We got the Archimedes version of KnacqTools feature complete and working well within a year. We demonstrated it to Acorn, who gave us two of the UN*X (BSD 4.3) version of their ARM workstations, called R260. On these we ported KnacqTools to OSF-Motif, which was the then-dominant window manager for X11. We got this feature complete in not very long, and it looked extremely impressive - so impressive that NCR gave us UN*X workstations with the intention that they would adopt KnacqTools as a product. However, the OSF-Motif version suffered from persistent memory leaks which we never fully resolved.

But in the meantime the company was in financial trouble. We'd started without capital and had funded the company organically by doing consulting. Members of the team wanted me to concentrate on product development, but I was also the person who could do the sort of consulting work which brought in money. And in the winter of 1981-2, recession hit the UK economy and our customers stopped spending on experimental AI projects.

If I knew then what I know now, the company could have been saved. What we needed was an angel, and we actually had a very good story to tell to angels. Our customers included Courtaulds - for whom one of our systems ran a critical chemical plant - Ford Motor Company, Bull, De La Rue, and a number of other industrial big players. We had a very impressive product, which was very close to finished. And we were teaching successful courses on the adoption of artificial intelligence into industry. But I didn't know about angels and I didn't have the contacts, so the company went down.

OK, that's the background: onwards!

Over a period of about three years I taught three-day courses on knowledge acquisition to around a hundred people mainly drawn from industry and the civil service - I can't remember why we didn't do more with financial services, which would have been an obvious market, but it's a fact we didn't. In teaching knowledge acquisition I obviously also taught an understanding of representing and structuring knowledge. The people I taught it to were mostly degree educated, and many were extremely senior people - ranging from senior engineers in charge of industrial plant, to executives evaluating future technologies for their companies. But they weren't by any means universally people with advanced maths skills or any previous experience of formal logic. These courses were generally very successful, and at the end of them participants typically could build working systems to make decisions in a domain of knowledge.

What made this possible was the very simple logical schema: ask the domain expert to choose the question to be answered, the proposition whose truth value was to be determined:

This person satisfies the conditions for widows' benefit

Ask them what the default value of the proposition is - in this case, 'false', most people aren't eligible for widows' benefit.

We would notate this as follows:

        Satisfies conditions for Widows' Benefit -

So OK, you ask the expert, what's the first thing you can think of which would make you change your mind, make you think a given person was eligible?

If they were a widow.

And if they were a widow, would they then be eligible?

No, their husband's national insurance contributions would have to have been up to date

And if their husband's national insurance contributions were up to date, would they then be eligible?

No.

We can now extend the graph:

        Satisfies conditions for Widows' Benefit -
                            |
                            |
                         Widow -
                            |
                            |
               Husband's contributions OK -

So, given they are a widow, and their husbands contributions were up to date, what's the first thing you can think of which would make you change your mind?

If they were under pension age when bereaved.

And if they were under pension age when bereaved, would they then be eligible?

Yes.

We now have a condition which overturns our original assumption that the person isn't eligible, and we notate it thus:

        Satisfies conditions for Widows' Benefit -
                            |
                            |
                         Widow -
                            |
                            |
               Husband's contributions OK -
                            |
                            |
                   Under pension age +

Is there any further condition that would change your mind and lead you to conclude that they were not eligible?

No.

So that closes one path and we recurse back up:

OK, so we've got someone who is a widow and whose husband's contributions were OK, but was not under pension age when bereaved. Is there anything else that would make you think they satisfied the conditions for widows benefit?

Yes, if their husband wasn't entitled to a retirement pension they would be eligible.

        Satisfies conditions for Widows' Benefit -
                            |
                            |
                         Widow -
                            |
                            |
               Husband's contributions OK -
                         /     \
                        /       \
       Under pension age +     Husband not entitled to retirement pension +

Again, you probe for further conditions; if there are none, you close the path, and recurse back.

OK, so we've got someone who is a widow and whose husband's contributions were OK, but was not under pension age when bereaved, and their husband was entitled to a retirement pension. Is there anything else that would make you think they satisfied the conditions for widows benefit?

No.

Again, we've nowhere further to go from this node, so we recurse back up:

OK, so we've got someone who is a widow but whose husband's contributions were not OK. Is there anything else at all which would make you think they satisfied the conditions for widows benefit?

No.

Again recurse back up:

OK, so we've got someone who is not a widow. Is there anything else at all which would make you think they satisfied the conditions for widows benefit?

No.

And so we can close this tree: it's complete. As you can see, this is a very simple procedure: the interviewer always knows what the next question to ask is, and the answers can be systematically recorded. So it's very easy to teach, and people easily understand how this process maps on to real world decision making. There may still be some things which aren't completely clear - for example, what is entailed by 'Husbands contributions OK'? But we can use that as the start of a new tree and repeat the process. We had special notepads printed with a triangular grid to make drawing trees easier, but that was mainly sales collateral; essentially anyone who can ask questions and scribble on paper can conduct one of these interviews.

So, again, I didn't have the original idea, that was down to Peter. But I did develop the interviewing practice, and I was the person who mainly delivered the courses on it.

The nature of logic is that you can operate on it with logic. A logical formalism can be translated into another logical formalism with similar (or lesser) expressivity, so it was trivial to translate D-Trees into production rules which would run on a conventional backward chaining production rule engine. I'd also argue, from experience, that D-Trees are easier for non-logicians to understand than production rules, since a non-specialist can assess whether a D-Tree has complete coverage of an area of domain knowledge, whereas it's much harder to assess whether a corpus of production rules does.

So to come back to your original question, yes, I know from experience that you can enable 'ordinary people' to understand inference, provided you can find a simple enough expression of it. Of course the D-Engine logic wasn't as sophisticated as a first-order predicate logic, and certainly wasn't as sophisticated as a constraint propagation logic, but I don't believe those are insuperable issues. You can't expect 'ordinary people' to understand dense code or complex formalisms; upside down 'A's and back turned 'E's, and all the various hooks of set notation, quickly alienate those who are not mathematically inclined. But all inference is the systematic performance of a limited number of individually simple legal moves, like chess; and 'ordinary people' can understand chess, even if many are (like me) not very good at it.

No comments:

Creative Commons Licence
The fool on the hill by Simon Brooke is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License