Minutes of Aug 11 PASREC Meeting - DRAFT
Attendees - IG, JK, EB, PC, PM, GW, QL, HG
This meeting is intended to cover in detail the strengths and weaknesses of Nirvana, and its deficiencies with respect to the Run II requirements. Its basis is Philippe's evaluation of Nirvana against the PASFRG and PASSUMA requirements.
The group consensus is that Philippe's estimate of 3 man-months for completion of the scripting language is probably too optimistic. It was stated that this time frame is for the functional part only. The ability to link in so will take a lot of work. Gordon has done some of this with C++. Doing it with C will be easy, but a lot of the linking details will be hard. For example, different compilers give different names to compiled objects, so such things can't be linked together. The three-month estimate is for a C API to Python only, and it will provide only the current Histoscope functionality. Pasha also stated that the scripting language needs a dictionary builder, and that he believes SWIG (Python) is harder to use than cint. He suggests that a comparison of the two would be best. Gordon contends that what is needed is some minimum functionality, and that cint lacks it. Finally, Pasha asked whether two histograms could be loaded and then chained or merged. Philippe said no, you must load both, or write custom code. Irwin and Gordon said that this would be pretty easy via C.
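To make the chaining/merging question concrete, the following is a minimal Python sketch of what "merging two loaded histograms" could mean for two 1D histograms with identical binning. The `Histogram1D` class and the `merge` function are hypothetical names invented for illustration; they are not part of Histoscope, Nirvana, or the proposed Python interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Histogram1D:
    """Hypothetical fixed-binning 1D histogram (not a Histoscope type)."""
    low: float
    high: float
    counts: List[float]

    def fill(self, x: float, weight: float = 1.0) -> None:
        """Add weight to the bin containing x; values outside [low, high) are ignored."""
        nbins = len(self.counts)
        if self.low <= x < self.high:
            index = int((x - self.low) / (self.high - self.low) * nbins)
            index = min(index, nbins - 1)  # guard against rounding at the upper edge
            self.counts[index] += weight


def merge(a: Histogram1D, b: Histogram1D) -> Histogram1D:
    """Return a new histogram whose bin contents are the bin-by-bin sum of a and b."""
    if (a.low, a.high, len(a.counts)) != (b.low, b.high, len(b.counts)):
        raise ValueError("histograms must share the same binning")
    return Histogram1D(a.low, a.high, [x + y for x, y in zip(a.counts, b.counts)])
```

Under these assumptions the merge itself is a few lines; the harder part noted in the discussion is exposing the tool's real histogram storage to the scripting layer.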
Irwin conjectured that we may want a new language on top of Python.
Discussion turned to the ability to save and recall a configuration. Irwin brought up the fact that the PASFRG requirements mandate access to data from a previous analysis session. The proposed Python interface will provide this capability. At present the saved configuration is a snapshot, a black box that cannot be edited. Will the command line interface do the same thing? Pasha mentioned that PAW will store a subset of histograms from an Ntuple, whereas Histoscope needs to re-read the entire original stores. This makes resuming a Histoscope session much slower.
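As an illustration of the contrast between a black-box snapshot and an editable configuration, here is a minimal Python sketch that saves a session configuration as human-readable JSON and reads it back. This is only one possible approach and was not proposed at the meeting; the function names and the configuration keys used in the usage note are hypothetical.

```python
import json
from typing import Any, Dict


def save_configuration(config: Dict[str, Any], path: str) -> None:
    """Write the configuration as indented JSON so it can be edited by hand."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2, sort_keys=True)


def load_configuration(path: str) -> Dict[str, Any]:
    """Read a configuration previously written by save_configuration."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A configuration such as `{"ntuple": "run1.dat", "cuts": ["pt > 20"]}` (made-up keys and values) could then be changed in a text editor between sessions, which a binary snapshot does not allow.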
Some more basic functionality is also missing, like the ability to .OR. two cuts.
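For reference, the .OR. of two cuts can be sketched in Python by treating each cut as a predicate on an ntuple row. This is an illustrative sketch only: `cut_or`, `select`, and the column names `pt` and `eta` in the usage note are assumptions, not an existing Histoscope or Nirvana feature.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]          # one ntuple entry, keyed by column name
Cut = Callable[[Row], bool]     # a cut accepts or rejects a row


def cut_or(first: Cut, second: Cut) -> Cut:
    """Return a cut that passes when either input cut passes."""
    return lambda row: first(row) or second(row)


def select(rows: List[Row], cut: Cut) -> List[Row]:
    """Return the rows that pass the cut, preserving their order."""
    return [row for row in rows if cut(row)]
```

With two hypothetical cuts, `high_pt = lambda row: row["pt"] > 20.0` and `central = lambda row: abs(row["eta"]) < 1.0`, the call `select(rows, cut_or(high_pt, central))` keeps every row that satisfies at least one of them.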
As for the framework to glue the Nirvana pieces together, another 6 man-months is estimated. This also sounds pretty optimistic.
A lengthy discussion of the data model took place. The long-awaited column-wise ntuple is almost ready - at present the program can read and write them, but not visualize them. Pasha said that even the Histoscope 5.0 spec contains only HBOOK plus a GUI. It was stated that this is completely unacceptable. The need for a very object-oriented representation was mentioned (the need to have arrays of "tracks", which may be combinations of floats, ints, …). What is really needed is to represent arrays of structures. Philippe said this is probably not possible, that fragmenting/reformatting will be necessary. Also useful will be objects within objects, and the ability to apply mathematical functions during the analysis session. It would be nice to have a varying number of varying structures. Philippe said this can probably be done if the maximum number is known.
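The "varying number with a known maximum" case can be sketched as a fixed-capacity array of structures plus a count of the entries actually in use. The Python sketch below uses the standard `ctypes` module to mimic a C-style layout; the `Track` fields, the `Event` record, and the `MAX_TRACKS` value are assumptions chosen for illustration, not the Nirvana or Histoscope data model.

```python
import ctypes
from typing import List, Sequence, Tuple

MAX_TRACKS = 4  # assumed upper bound on tracks per event


class Track(ctypes.Structure):
    """Hypothetical track: a mix of float and int members."""
    _fields_ = [("pt", ctypes.c_float),
                ("eta", ctypes.c_float),
                ("nhits", ctypes.c_int)]


class Event(ctypes.Structure):
    """Fixed-size record: a count followed by MAX_TRACKS track slots."""
    _fields_ = [("ntracks", ctypes.c_int),
                ("tracks", Track * MAX_TRACKS)]


def make_event(tracks: Sequence[Tuple[float, float, int]]) -> Event:
    """Pack up to MAX_TRACKS (pt, eta, nhits) tuples into one Event."""
    if len(tracks) > MAX_TRACKS:
        raise ValueError("too many tracks for the fixed-size record")
    event = Event()
    event.ntracks = len(tracks)
    for i, (pt, eta, nhits) in enumerate(tracks):
        event.tracks[i] = Track(pt, eta, nhits)
    return event


def used_tracks(event: Event) -> List[Track]:
    """Return only the filled slots, ignoring the unused capacity."""
    return [event.tracks[i] for i in range(event.ntracks)]
```

Every event occupies the same number of bytes regardless of how many tracks it holds, which is why the maximum must be known in advance; the cost is wasted space in events with few tracks.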
The viability of a D0OM to Python/Histoscope interface was discussed. Gordon's view is that if the underlying representation is C++, this will probably be pretty easy. If it is Python, it will be difficult. It is probably possible to write Python objects to do such things.
Qizhong asked if there are any users of Nirvana for offline analysis. Apparently there aren't, at least in a large group (no experiments).
The question was raised whether there is anything that Histoscope can do better than ROOT. The consensus is that the Histoscope UI is much better, but that ROOT is better at everything else. This led to a final topic, namely a possible bridge between the two products. In other words, is there a way to combine the ROOT I/O, object model, and cint with the histo GUI? Perhaps a Nirvana layer over ROOT?
Summary -
Command line interface in progress, with schedules for completion in slides.
Object model - right now it is heading toward HBOOK, which is insufficient. It is easy to interface the models at the Python level, but the analysis tool needs to use the same model. This will be much work. Philippe said the best available in a timely fashion is probably a variable length vector. This is insufficient.
Next Meeting: Friday (MATLAB)