First off, Pronto 0.2 – is an optimization/bug fixing release meaning that there are no reasonably important new features (at least as far end users are concerned). We have very much been concerned with performance/scalability issues which were (and some still are) rather glaring in 0.1.
Here I must say that the success w.r.t. performance and scalability were quite different. There’re pretty substantial performance improvements, i.e. Pronto 0.2 is more predictable and few times faster than Pronto 0.1 when running on the same set of samples. Simply put, if Pronto 0.1 is capable of reasoning with some probabilistic ontology, Pronto 0.2. will do it much faster. How much faster depends on the KB.
Regarding scalability, the progress has been less impressive meaning that it still can’t scale substantially beyond ~15 generic (TBox) conditional constraints. Being a little more technical here: it’s polynomially better than Pronto 0.1 but that doesn’t entirely solve the problem with exponential complexity of probabilistic DL reasoning. More news about the ongoing progress in that direction is expected towards the end of summer 2008.
Now let’s turn to the nature of the performance improvements. The main is the new lexicographic entailment algorithm that replaced the original Lukasiewicz TLexEnt procedure [1] that was based on brute force search for the preferred models (recall that Pronto is a non-monotonic reasoner, so it infers knowledge from the set of preferred models of the KB, not from the set of all models). The new algorithm is based on the idea of carefully identifying and eliminating small pieces of the KB that can’t be satisfied by the preferred models. This is similar to computing maximal satisfiable (minimal unsatifiable) subsets of ontology – techniques that are used for debugging classical OWL ontologies. The algorithm helps to make reasoning a lot more predictable and robust, i.e. reduces the difference between the “easy” ontologies and “hard” ones that previously required 10-15x reasoning time (in spite of having the same size). More detailed technical description of the algorithm can soon be found in our paper that was recently accepted to ISWC’08 [2].
There are also other optimization techniques most of which try to reduce the excessive amount of classical DL reasoning performed during the probabilistic entailment.
Improvements (i.e. robustness and speed-up) are illustrated on the following diagram (taken from the ISWC paper). It compares the performance of Pronto 0.1 and Pronto 0.2 on a set of randomly generated probabilistic ontologies. Y axis shows the reasoning time and X axis shows the metric that we developed to assess “hardness” of specific probabilistic KBs before reasoning with them. Basically we wanted to understand what makes KBs hard for reasoning and it seemed that “connectivity”, i.e., number of all possible subsumption relations between OWL classes in probabilistic statements might matter (the higher connectivity - the easier must be the KB). The results show that in addition to simply being faster Pronto 0.2 is also more robust as there are no “hard outliers” – KBs that take a lot more reasoning time comparably to others of the same size. And it’s more predictable, i.e. by using the metric (which is fast to compute) one can roughly estimate reasoning time (if using Pronto 0.2).
I am hoping to make the evaluation results repeatable by other people. To achieve this we started a small project PREVAL_DL which is a reasoner-independent tool for evaluating the performance of P-SHIQ(D) reasoners. It can generate instances of probabilistic reasoning problems and run your favorite P-SHIQ(D) reasoner through them (Pronto and ContraBovemRufum are the only available so far).We are currently working on the scalability problem. Hopefully soon we’ll have a better idea of how large probabilistic ontologies can still be reasoned with. Scalability improvements after that point will likely be concerned with partitioning/decomposing probabilistic KBs onto smaller probabilistically independent fragments.
[1] T. Lukasiewicz, Expressive Probabilistic Description Logics, Artificial Intelligence, 172(6-7), 852-883, April 2008
[2] P. Klinov and B. Parsia, Optimization and Evaluation of Reasoning in Probabilistic Description Logic: Towards a Systematic Approach, accepted to the International Semantic Web Conference, Karlsruhe, October 2008