Eileen Berman, Philippe Canal, Frank Chlebana, Irwin Gaines, Herb Greenlee, Jeff Kallenbach, Rob Kennedy, Qizhong Li, Pasha Murat, Gordon Watts
This report contains the results of the Run II Physics Analysis Software Recommendations committee (PASREC). It contains a summary of the work completed by the committee as well as a recommendation for Run II analysis software.
I. Introduction
ROOT:
NIRVANA:
Commercial products:
Shareware:
III. Detailed Product evaluations from PASFRG and PASSUMA criteria
ROOT - See Appendix 1
Nirvana - See Appendix 2
MATLAB - See Appendix 3
IV. Conclusions
Either ROOT or Nirvana could meet the functional and support requirements.
We recommend that ROOT be adopted as the standard physics analysis package for Run II, contingent on a collaborative agreement with the ROOT team. It should be recognized that this recommendation depends critically on timing and on sharing development with outside collaborators, and the steering committee should assess the validity of these assumptions in evaluating the recommendation. In particular, if the requirement for an immediate choice is being driven by on-line needs (which may not require the full functionality of an off-line analysis package immediately), it needs to be determined if the components of NIRVANA that already exist are adequate for the immediate needs.
It is highly likely that by the end of Run II (or by the time of the LHC) commercial components will be heavily used for analysis tasks. Commercial offerings should continue to be investigated and made available (perhaps on limited platforms). The Computing Division should also initiate formal collaboration with the LHC++ project so as to have some influence on the choices made and the direction taken. These two initiatives, while lower priority than the immediate ROOT support and development needs, should position us to take full advantage of the expected evolution of these products.
Scripting Language: C/C++ interpreted language (CINT)
User Control:
Data Selection:
Offline Compatibility:
Prototyping:
--------------------------------- PasSuma Checklist - ROOT ----------------------------
Contributors:
- Rob Kennedy 31-July-1998
- several additions by Pasha Murat (Aug 04 1998)
----------------------------
a) What is the customer base and what is their experience and opinion? For commercial software or non-HEP freeware, one should get a list of customers and references.
RDK) The customer base is primarily HEP experimenters and support personnel, with a number of experiments officially using ROOT as part of their data handling mechanism or design. For example, here is a list of applications and links to ROOT located at http://root.cern.ch/root/ExApplications.html:
- NA49 ROOT Physics Analysis Classes
- ROOT Primer by Soren Lange
- ROOT at GSI
- More 3D visualization for the CMS Track Reconstruction Prototype
- Clusterization in the CMS ECAL
- The PHOBOS Analysis Toolkit
- The E907 experiment
- SAL Scientific Applications On Linux
- The Cetus Links
- The Rosebud Package
- ROOT used for event monitoring in the Finuda experiment
- ATLFast++, the ATLAS fast MonteCarlo
- gh2root: Generates C++ classes to convert Geant3 KINE/HITS to ROOT
- Direct Photons produced at RHIC energies
- ROOT in STAR (large heavy ion experiment in Brookhaven)
- The ALICE simulation/reconstruction framework
PM) More customers:
- STAR@RHIC decided to proceed with a full-scale evaluation of ROOT as a CERNLIB replacement
- CDF activities:
  - many physicists are trying to use ROOT for analysis
  - prototyping of ROOT-based online consumers
  - the simulation project is prototyping ROOT tools
  - the CDF SVXII test stand is writing its data out as ROOT ntuples
  - the CDF Karlsruhe group is prototyping a ROOT-based interactive event display

b) How long has the product been in existence? What version is the product at? How many major releases have there been? How often is there a minor release? Several major releases or regular minor releases with integrated bug fixes are good signs of a well supported mature product. Availability of published books on the product is also a sign of maturity as well as of an established customer base.
RDK) The ROOT project was started by Rene Brun in late November 1994.
His long-time collaborator Fons Rademakers joined the project around January 1995. In the middle of August 1995 Nenad Buncic joined the team, followed by Valery Fine in December 1995. Masaharu Goto created and supports CINT, including the ROOT variant of CINT, RINT.
RDK) The current version is 2.00/09 (ROOT's style of version numbers). There have been two major releases that I know of (1.0 and 2.0), and patches appear about every 2 to 4 weeks. The patches appear to be roughly equally divided between developer-realized issues and user defect reports/feature requests.

c) How long will the product survive? Are there any competing products that are likely to win the market (including freeware)? Who is the product developer and are they well supported financially (graduate student or full-time staff)?
RDK) I recall that Pasha had a statement from Rene which was a commitment to support ROOT until ?. It is unlikely that commercial packages will replace the demand for this product. After all, they have not succeeded to date, and they predate ROOT. With less certainty, in my opinion, it is unlikely that freeware packages will replace ROOT either, since ROOT offers significant functionality not found in other packages, such as interactive object browsing. ROOT is developed by the "ROOT team" (see above), which consists of at least two full-time developers, two part-time developers, and a number of specialists working on specific aspects of ROOT as time permits.
PM) The ROOT team has an excellent record and many years of experience with HEP software. R. Brun was the leading developer of CERNLIB, F. Rademakers maintained CERNLIB for several years, and V. Fine ported CERNLIB to the PC/Intel architecture (DOS/Windows 95/Windows NT). The team is very productive.
???) Financial backing
a) Who provides consulting support? Commercial, other Lab, CERN, Fermilab? Are they responsive? Newsgroups and dejanews may provide some information on support response (though these tend to be biased). This is rather subjective and should be treated as such.
RDK) Consulting support is primarily provided by Rene Brun and Fons Rademakers, with FNAL local consulting unofficially provided as time permits by Pasha Murat. In the opinion of everyone I have spoken with, the ROOT support turn-around after contacting them is good to outstanding. To get a better idea of the support activity, see the ROOTtalk e-mail archive at: http://root.cern.ch/root/roottalk/AboutRootTalk.html

b) Who can get support? Particularly for commercial software, can any user of the product access the support services or are these limited to a pre-specified list of local contacts?
RDK) Anyone can get support. There is no requirement to sign up or provide personal information before one can post to ROOTtalk, though one can sign up in order to receive the postings and responses in one's own e-mail reader. I prefer to use the WWW interface to ROOTtalk myself. Presumably, if ROOT is selected by FNAL, then some local support apparatus might be set up to help alleviate the support burden on the ROOT developers (as we have done with Kai, with C++, and other topics).

c) Is the use of the product in the community enough that there is a pool of people/knowledge to draw from for support if needed? HEP use should be assessed; PAW knowledge in the HEP community is widespread and ROOT is growing. A dedicated newsgroup would be a plus.
RDK) Yes, in my opinion, there are enough users of ROOT on different operating systems (Unix and Windows) to provide a knowledge base outside of the ROOT (support) team itself. ROOTtalk acts like a newsgroup, though the exact mechanism is different (http://root.cern.ch/root/roottalk/AboutRootTalk.html).
ROOT in many ways can be used as an overhauled PAW, though the PAW model of data analysis does not seem to lead to the most efficient use of ROOT.
PM) The ROOT user community is 99.9% HEP community. In practically all the US HEP laboratories, including BNL, LBL, LLNL, FNAL, and SLAC, there are physicists using ROOT. There are ROOT users at CERN and in Russia, again - in the HEP community. From the point of view of accumulated knowledge of the product, BNL and FNAL are already capable of providing local support.

d) Is user training needed and available? What is the cost?
RDK) In my opinion, user training is needed, though there are many documents available to help users understand and use ROOT. I think a five-page tutorial on how to start, interact with, and quit ROOT would be a good complement to the existing documentation. Also needed is English documentation on ROOT CINT, as well as a one-page tutorial on what is known to work and what fails with CINT. The cost of the five-page tutorial would be very small. The cost of the CINT documentation may be an FTE week or two of the time of someone who is familiar with CINT.
PM) Many ROOT commands have their PAW counterparts (hist/plot vs hist->Draw(), for example), so PAW users adapt to the "ROOT philosophy" pretty easily. New commands/tools require more training, mostly in C++ itself. I have also heard comments that using CINT makes it much easier for physicists to take their first steps in learning C++.

e) Is training required and available for support staff? What is the cost (time and money)? For commercial products such as SAS and IDL, support and user training may be required to optimally use the product, and that cost should be folded in.
RDK) No specific training is required for support staff. ROOT includes documentation of its internal data formats and implementation classes (not visible to the user).
Some of the mechanisms in ROOT are non-standard (especially RTTI), and will take a few FTE weeks to document more completely for the support staff.

f) How much (local) support will be required (is it complicated and hard to use)? This and the remaining questions in this section can be determined by talking to current users or scanning any newsgroups, mailing lists or FAQs.
RDK) This depends heavily on how we plan to improve ROOT and CINT locally, and on how many of its limitations will be fixed by September 1 or will simply be accepted. Many new users have been productive with ROOT in a few days to a week, but many heavy PAW users have found that the transition in thinking makes using ROOT seem complicated, tedious, and bug-prone. I think that once ROOT is selected and a much larger number of users have adapted to it (and provided examples of analyses for others to use), the need for local support will predominantly be for adapting ROOT to OS/compiler combinations not supported officially by the ROOT team, and for adding staff to the ROOT team to handle uncovered defects and implement new features. Perhaps local support could start out as dedicated ROOT testers to have the most impact on ROOT's quality.

g) For commercial or freeware, what kind and quality of user level support is provided?
RDK) Via ROOTtalk, individual users interact via e-mail directly with Rene and Fons. One does not always get an instant fix, but one does get a thoughtful, intelligent reply to one's e-mail. In some cases, other users who know the answer will step in and reply.
PM) There are 2 mailing lists, ROOTTALK and ROOTDEV: the first is intended for general discussion, the second for bug reports. Most users use the first list for all purposes.

h) Is the software completely and well documented at the user level?
RDK) In my opinion, the software is well-documented as to what it *should* do, but not necessarily as to what it is known to be capable of doing right now.
This is in part due to the development style of the ROOT team, which emphasizes the goal functionality without listing what subset of this has been fully "certified" as operational. The documentation is not as comprehensive and reference-oriented as some commercial documentation I have seen, but it ranks very well against other freeware package documentation.
PM) The ROOT software is extensively documented. The documentation system is source-based, and in this sense it is more developer-oriented. What could be improved is the documentation for beginners (including inexperienced C++ users).

i) Is a system manager required in order to install and/or maintain the package? If so, this would significantly complicate matters for some remote users who do not have ready access to (or a friendly relationship with) the system managers of their computer.
RDK) A system manager is not required to use this, but we probably will distribute this from FNAL through the UAS UPS/UPD model, which implies that a "products" support person will probably install the UPS root product on machines. Some machines include an alternative "products" area administered by a normal user, bypassing the requirement that someone have access to the "products" account. ROOT is compatible with this approach too.
a) What types of licensing are available?
RDK) There is only one license, making ROOT free for non-commercial use.

b) What is the cost? For Universities? Lab?
RDK) ROOT is free for non-commercial use for everyone. If you want to pay for it, I am sure that the ROOT team will accept donations. They seem to be very willing to accept computer accounts on machines with compilers that give them access to different OS/compiler combinations (cdfsga and Kai C++, for example).
a) Who provides maintenance both local and external to the Lab? What are the fallbacks (if the maintainer(s) is run over by a bus or the company folds)?
RDK) Currently the ROOT team and associates provide all the maintenance. There is no reason to believe users (FNAL, for instance) cannot contribute to maintenance and have changes rolled back into the ROOT repository. For now, however, one must learn CMZ to do this, which inhibits users from working with the source code to overcome locally discovered problems. I tried to get ROOT to work under Linux2 with KCC v3.2 (local build with debug symbols) and made little progress.
RDK) Since the source code and build procedures are available (though we would like to see them moved out of CMZ with kumacs and into CVS with makefiles), anyone can provide maintenance. For now that is almost entirely the ROOT team and associates. If Rene and Fons were on an ill-fated airliner, a collaborative maintenance team could be formed which would function like the EGCS compiler development "team", using a world-readable CVS repository. Clearly this would be different from having two or more maverick coders turning out 100 new lines per day (my groundless guess), but the product would survive the transition.
PM) Users from BNL have started actively contributing to the code distribution:
- S. Adler (BNL) generated rpm's for i386 and Alpha Linux
- D. Morrison (BNL) generated a ROOT distribution tar-file based on the GNU configuration tools autoconf, libtool, and automake.

b) Are the maintainers responsive and are bug fixes turned around in a reasonable amount of time?
RDK) In my and others' opinion, yes. Some defects unrelated to core functionality take longer to get fixed, but this is a reasonable choice on the part of the ROOT team.

c) Does the software maintainer need additional training (beyond that needed by users)? If so, is it available and at what cost?
RDK) Right now, today, a maintainer must learn CMZ.
I have done it with help from Pasha, Pasha has done it, and we would not wish such on our fiercest competitor. With a move to a CVS repository and makefiles, this burden will be eliminated. ROOT is a diverse package, though. It includes elements of graphics, HTML, Postscript, data structures, complicated RTTI, statistics, and basic data presentation. No one person here is likely to be able to cover all those subjects at an expert level and maintain 100% of ROOT. It will cost some time to familiarize those expert in a subject with the source code in ROOT related to that subject.

d) What are the maintenance/licensing costs for commercial products?
RDK) Maintenance is free. It would not hurt to give them an account on your machine if you are working with an OS/compiler version to which they do not have ready access.

e) How much software is there (line count)? How much needs to be supported locally (how many people required)? Can/should support be split up into areas of expertise (e.g. motif/graphics, interpreter, etc.)? This is mainly significant for non-commercial software that will be maintained locally.
RDK) "The ROOT system consists of about 480,000 lines of code (390,000 lines C++ and 92,000 lines C). The C language is used in CINT and in pieces of public domain code that perform specific functions like, terminal I/O handling (Getline), data compression (Zip) and the 3D interactive interface to X/Windows (X3D)." Also, much of the C code is the result of translating F77 (MINUIT, simulation packages).
RDK) The support can be split up into areas of expertise fairly easily. All of the maintainers would have to understand at some level the basic infrastructure: memory management, data structures, IPC services, and so on. Beyond that, the modularity in ROOT is based on high-level areas of expertise. Here is a text translation of the "ROOT System Tree" to give some idea how this is organized. Roughly each subject below has its own library of classes.
[ROOT System Tree: the original diagram layout is not reproduced; its nodes, in the order they appear from the top of the tree to the bottom, are:]
- NA49
- RINT (ROOT CINT)
- CINT C++ Interpreter
- Detector Description
- User Interface Components
- Minimization
- Geometry
- Rendering
- Formula Evaluation
- Style Management
- Containers
- Ntuples
- 3D Graphics
- Object I/O
- Trees
- 2D Graphics
- Histogramming
- Postscript
- Object Runtime Services
- IPC Services
- X11/Windows/Mac Interface
- Memory Management
- OS Interface

f) In the case of commercial software, is source code available (in escrow)? This would be required for finding bugs locally or in case the company folds. This may be an additional cost.
RDK) Source code is available, although it is maintained in a CMZ repository.
a) What kind of build environment is provided? Is it robust? This is mostly relevant for non-commercial software that may need to be co-maintained.
RDK) The maintenance and build environment is CMZ (CERN Patchy combined with CERN Zebra). It is robust, but very clumsy and tedious to use and arcane in its user interactions. In some cases, I have had to put in a symlink to the Kai compiler in order to get CMZ to recognize where it is. Surely there is a way to avoid this, but the symlink was faster than finding and reading CMZ documentation. This is completely unacceptable, and would not be too difficult to change (just time-consuming).

b) Can the package be built AT ALL on new or different sub-systems? ROOT still provides NO makefiles.
RDK) ROOT is beginning to include makefiles for some selected OS/compiler combinations as of 2.00/09, but not many. Once one learns CMZ, and is willing to edit code within CMZ, one can adapt ROOT to new or different systems. Once ROOT moves to CVS with makefiles, this will be much, much easier.

c) Is the source repository accessible so that local support persons can select which changes to accept and which to reject for local use? ROOT still uses CMZ.
RDK) The primary source code repository is not available to the general public. One can, if one knows where to look, get a *copy* of the CMZ file containing all the source and build procedures for a particular version of ROOT. One cannot tell from the filename, however, *which* version of ROOT a particular CMZ file contains (a convention problem from the ROOT team). This should all be changed to an open CVS repository as is used for EGCS compiler development.

d) Will the software have to be maintained and/or extended locally and externally? If so, can the software be maintained in a common repository? If separate repositories, what is the commitment to keep them from diverging from modifications, extensions and bug fixes?
This excludes locally maintained extensions which use pre-defined APIs or hooks into the product, which we will have to maintain ourselves in any case.
RDK) A small FNAL ROOT team might port ROOT to new OS/compiler combinations that the ROOT team does not have access to and make high priority modifications (for example, fixing shared memory access for online monitoring programs during data-taking).
RDK) We should develop a model with the ROOT team that is somewhere between a single shared repository (EGCS model) and a subordinate repository where changes are fed back to the "central ROOT team" for consideration for inclusion in the master repository (CLHEP - FNAL Zoom model). I do not think we should allow or tolerate divergence in separate sibling repositories (BaBar - CDF Framework collaboration model), because resyncing the ROOT repositories would be an overwhelming task which would make permanent divergence seem like an economic alternative. ROOT as a product does not exclude any of these models once it is moved to a CVS-based repository. For now, maintenance and development collaboration with CMZ as the repository does not seem very economical, as all local personnel would have to be trained in using CMZ effectively in a collaboration.

e) Is the software passed through quality assurance software such as Purify or Insure++ before being put into production?
RDK) To my knowledge, ROOT has not been "Purified" or "Insured". It is clear that ROOT leaks memory, for instance. Its own statistics show that clearly. ROOT sessions end with a memory allocation/de-allocation histogram which documents a large number of memory leaks, some of considerable size. I do not know how ROOT will be judged by a C++

f) Are there any restrictions that would prevent the product from being placed in the Run II infrastructure (i.e. UPS/UPD)? In particular, the ability to support more than one version of the product on the system.
RDK) No, there are not.
I am doing exactly this with ROOT v2.00/08 in support of CDF's Event I/O facilities. ROOT depends on a single environment variable, ROOTSYS, to indicate the base of its internal file system. On some OSs LD_LIBRARY_PATH must also be set. This is very easy to implement in the UPS/UPD framework.

g) Are release notes and change lists provided with releases? For example, the commercial product IDL comes with "what's new" and release notes lists.
RDK) Yes, there are good "CHANGES" files available on the web that allow users to determine whether a desirable patch is in a new version before downloading and installing it. They can be found at http://root.cern.ch/root/Availability.html underneath the version numbers at the top of the page.
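Item (f) above notes that ROOT locates its installation through the single environment variable ROOTSYS, with LD_LIBRARY_PATH needed on some systems. A minimal sketch of how a wrapper or setup check might read these settings follows. The helper names `env_or_empty` and `root_lib_dir` are hypothetical (they are not part of ROOT or UPS/UPD), and the sketch assumes that the shared libraries sit in a `lib` directory directly under ROOTSYS.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not a ROOT or UPS/UPD function): returns the value of
// the named environment variable, or an empty string when it is not set.
std::string env_or_empty(const char* name) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string();
}

// Hypothetical helper: builds the directory that one ROOT version would add
// to LD_LIBRARY_PATH, assuming its libraries live under <ROOTSYS>/lib.
// An empty ROOTSYS yields an empty result so the caller can report the error.
std::string root_lib_dir(const std::string& rootsys) {
    if (rootsys.empty()) {
        return std::string();
    }
    return rootsys + "/lib";
}
```

Because everything is derived from one variable, two ROOT versions can sit side by side on the same machine; a caller would evaluate `root_lib_dir(env_or_empty("ROOTSYS"))` after the product setup has selected a version.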
a) Are there active mailing lists/FAQs/newsgroups for the product? How do they reflect on the product? ROOT has a support list, the commercial product IDL has FAQs and a newsgroup.
RDK) ROOT has an e-mail support list, ROOTtalk, which includes a search engine on the e-mail digest: http://root.cern.ch/root/roottalk/AboutRootTalk.html.

b) Are recent releases extensions/enhancements and not bug fixes?
RDK) My impression from reading through the CHANGES files for the last few minor releases is that there is roughly a one-to-one ratio of added features/classes and reactive modifications. By reactive modifications, I mean changes to existing code which do not add functionality, such as defect fixes, run-time improvements, and minor design-related changes.

c) Are product releases reasonably paced and useful?
RDK) There have been two major releases that I know of (1.0 and 2.0) in the last 15 months, and patches appear about every 2 to 4 weeks. The patches appear to be roughly equally divided between developer-realized issues and user defect reports/feature requests. Depending on the features in ROOT which you exploit, the patch releases may or may not be immediately useful to you.
PM) For example, this spring the concept of a multifile tree was introduced, and in the coming release (2.11) we expect to have a new LaTeX interface for writing formulas.
a) Will the tool/software need to be upgraded (additions/replacements) to satisfy Run II functional requirements, and how difficult will this be? Does the product provide API/hooks to easily interface locally written extensions? Would existing support be able to handle this or would manpower need to be added? For example, adding a command line (e.g. Python) to Histoscope is thought to be difficult. Anticipated additions and replacements should be identified.
RDK) To meet the functional requirements, ROOT will at least need some work done on CINT to make it more standards-compliant and more robust. A complete C++ interpreter to replace CINT would probably require acquiring a C++ front-end (from EDG, US$60k) and applying roughly 2 to 4 FTE years of effort. More modest approaches to improving CINT would require less effort, but are not now well-defined. *** This answer should be better developed ***
RDK) ROOT provides enough hooks to allow locally written extensions to be used with ROOT in most cases. It would still take a fair amount of effort, however, to replace an existing ROOT "module" like Linear Algebra with a locally developed solution, due to ROOT RTTI expectations and ROOT module interdependence.
PM) There are "2-way" hooks available:
- ROOT C++ classes can be used in the offline/online code,
- the existing offline shared libraries can be loaded dynamically and used within the ROOT interactive framework.

b) Is the product modular? Is it broken down logically and physically into reasonably distinct sub-systems? Can some of these sub-systems be replaced by external packages with the same functionality? For example, in ROOT can Linear Algebra or Minimization be done by packages developed specifically to solve these topics separately from ROOT? A mini code review should be performed on the package to determine what would be involved in replacing such an identified component or sub-system.
RDK) ROOT is very modular, but some modules are also fairly interdependent. It would be very difficult to remove some ROOT modules and re-use them outside of the ROOT framework (which is not what is meant here, of course). It would take a fair amount of extra effort to take a module like Linear Algebra and install a replacement which has all the "ROOT-like" functionality of the original, especially in the context of ROOT RTTI, which permits interactive object browsing. Another issue is that the ROOT Linear Algebra module has an interface which must be preserved for other ROOT modules to continue to function, so a replacement would probably need some interface adapter layer before it could be installed.

c) If new functionality needs to be added, is the software sufficiently modular such that the code changes can be localized? For example, it is believed this is not the case for adding STL support to CINT. A mini code review should be performed to assess whether extensive and/or destabilizing code changes would be required to add the functionality.
RDK) Most of ROOT, in my experience, appears to have fairly localized functionality, allowing extensions to be added fairly easily. CINT is the obvious exception. CINT is poorly and irregularly organized compared to C/C++ compilers, lacking distinct parsing, symbol table management, and action code. Further, its fundamental design is flawed in that the parser itself is the wrong variant to handle C++ syntax efficiently, requiring the syntax to be implicitly expressed in coded procedures instead of an easily edited, conceptually clear grammar.

d) Is the software sufficiently modular that bug fixes are localized? A mini code review focused on a particular section or component of the software should provide information on this.
RDK) Most of ROOT, in my experience, appears to be sufficiently modular, allowing bug fixes to be made fairly easily.
e) If a component needs to be replaced in its entirety, is the software sufficiently modular such that a new component can be slotted in with minimal disruption? For example, ROOT depends on functionality in CINT other than the interpreter in a fundamental way.
RDK) Due to ROOT RTTI expectations, it would not be trivial to drop in module replacements, but neither would it be technically difficult. The exception to this is CINT, which has a "broader interface" to ROOT than a simple C++ interpreter. CINT passes additional information about objects to the rest of the ROOT infrastructure that a "traditional" C++ interpreter would not. Also, it is not clear that the ROOT-CINT interface is well-documented.
PM) It is important to understand that the dependence discussed provides many unique features not available in other packages. For example, ROOTCINT is used for automatic generation of dictionaries, which makes it trivial to hook up any external code.

f) What distinct (external) packages or interfaces are required to build and/or run the package... Motif (shared libraries), OpenGL, etc.? Are there external software components that are out of the maintainer's control? LHC++ depends on numerous commercial packages, Nirvana/Histoscope on Motif. All such packages and interfaces should be identified.
RDK) No external packages are required to build and use ROOT, though OpenGL appears to be capable of being used with ROOT. ROOT does not require Motif, 3D X11 packages (one is supplied), or any other commercial/freeware package to be supplied by maintainers or users. ROOT does supply, integrated into its source tree, several freeware packages which it uses.
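The answer to item (b) above suggests that a replacement linear algebra package would probably need an interface adapter layer so that other ROOT modules keep working. The sketch below illustrates that idea in plain C++. All names in it (`LegacyVectorInterface`, `ExternalVec`, `external_inner_product`, `VectorAdapter`) are hypothetical and are not ROOT classes; a real replacement would also have to satisfy the ROOT RTTI expectations, which this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the interface that existing modules already call.
class LegacyVectorInterface {
public:
    virtual ~LegacyVectorInterface() {}
    virtual int Size() const = 0;
    virtual double At(int i) const = 0;
    virtual double Dot(const LegacyVectorInterface& other) const = 0;
};

// Hypothetical external package with its own data type and calling convention.
struct ExternalVec {
    std::vector<double> data;
};

double external_inner_product(const ExternalVec& a, const ExternalVec& b) {
    assert(a.data.size() == b.data.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.data.size(); ++i) {
        sum += a.data[i] * b.data[i];
    }
    return sum;
}

// Adapter: presents an ExternalVec through the legacy interface, so callers
// written against LegacyVectorInterface need no changes.
class VectorAdapter : public LegacyVectorInterface {
public:
    explicit VectorAdapter(const ExternalVec& v) : vec_(v) {}

    int Size() const { return static_cast<int>(vec_.data.size()); }

    double At(int i) const { return vec_.data[static_cast<std::size_t>(i)]; }

    double Dot(const LegacyVectorInterface& other) const {
        // Copy the other operand element by element through the legacy
        // interface, so any implementation of it can be combined with this one.
        ExternalVec rhs;
        for (int i = 0; i < other.Size(); ++i) {
            rhs.data.push_back(other.At(i));
        }
        return external_inner_product(vec_, rhs);
    }

private:
    ExternalVec vec_;
};
```

The extra copy in `Dot` is the price of going through the old interface, which gives a feel for why such a layer costs "a fair amount of extra effort" and some run-time overhead.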
a) Platform availability: Linux, NT, Solaris, IRIX, HP, DEC-Unix ... ? If a specific Run II platform isn't supported, what it would take to get support for it should be determined.
RDK) All Run II platforms are supported by ROOT. We (Pasha) have asked the ROOT team to support the Kai C++ compiler, and they have done so, with our providing accounts for ROOT developers on appropriately equipped platforms.

b) Porting of code to new platforms; this applies to non-commercial software that is currently only supported on selected systems. The issue to be raised is the ability of the original developers to accept changes to be incorporated into the base code, so that any porting done here is done once (aside from effects of future OS upgrades) and does not have to be re-done with each release of the software.
RDK) Because ROOT is already supported on many more OSs than are being considered for Run II, I do not think porting to new OSs is a potential problem. Since they have already ported ROOT to a (relatively) standards-compliant compiler, I do not think that porting ROOT to a new standards-compliant compiler is a potential problem.

c) How sensitive is the package to minor OS and/or compiler and/or system header variations? ROOT under Linux may be sensitive to which C-libs are in use, which distribution you are using, which kernel you are using, which system header patches you have applied (esp. CINT), and so on. PAW is sensitive to OS upgrades. This can be determined by looking at support history in newsgroups, other support logs, or talking to the user community.
RDK) This sensitivity of ROOT to minor OS/header changes was definitely a problem with ROOT v1. Since then, ROOT has adapted to the same Linux distribution that FNAL has chosen to support, and distributes ROOT for old and new versions of the Linux C library.
This does not mean that changing an important system header will not affect ROOT, just that we do not now see a problem with the changes I recommended to the FNAL Linux distribution.

d) 64 bit considerations: Does the software run on 64 bit platforms/OS (alpha/Unix, SGI/Unix, future, e.g. Merced)? Will it be difficult to port to 64 bit systems (a la COMIS for PAW)?
RDK) The C++ code in general should be 64-bit clean. I do not know if the C and F77-converted-to-C code is, but I suspect it is or can be easily adapted to be 64-bit clean. I wonder, though, about the implications of no longer being able to convert ntuples from HBOOK to ROOT, since Zebra does not function on 64 bit systems. Note that while Dec Unix and SGI IRIX are both 64 bit systems, we run both with 32 bit pointers (Dec) or in 32 bit mode (SGI), largely because Zebra does not function correctly on a complete 64 bit system.
PM) The most system-dependent part of the ROOT system, CINT, has been ported to the 64-bit IRIX architecture at BNL.

e) Are there endianness and other heterogeneous environment considerations?
RDK) ROOT currently only supports big-endian IEEE (IEEE floating point) files. That is a reasonable choice during their rapid development phase; Trybos at CDF has made the same choice. Nevertheless, we should require that ROOT support little-endian IEEE files in the future to improve performance on little-endian systems, such as Linux/Windows based on Intel chips and Dec Unix based on Alpha chips.

f) Is the software product build dependent on a specific compiler or is it compiler independent? If compiles are not needed for the product, are there any compiler dependencies present in the API used for locally written extensions? In particular, if the software needs to be built with Run II compilers, it should be verified that it can be.
RDK) The ROOT build procedures are mildly dependent on the compiler in use, but not much more so than any other C++ product.
Afterall, different compilers have different switches to express similar concepts. ROOT libraries, because they are built from C++ code, are specific to a particular C++ compiler and to certain switches (exceptions on/off with Kai C++, threads on/off with MS VC++). ROOT memory management is known to fail or misbehave on some OS-compiler combinations for various reasons such as a compiler not allowing overload of global new.
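The endianness and pointer-size points under d) and e) can be illustrated with a small sketch. It is not ROOT code; it only uses Python's standard struct module to show what writing big-endian IEEE doubles means and why a little-endian host has to swap bytes on every read or write. The function names are hypothetical and chosen for this illustration only.

```python
import struct
import sys

def pack_doubles_big_endian(values):
    """Serialize floats as big-endian IEEE 754 doubles (8 bytes each)."""
    return struct.pack(">%dd" % len(values), *values)

def unpack_doubles_big_endian(payload):
    """Read back big-endian doubles; on a little-endian host this implies a byte swap."""
    count = len(payload) // 8
    return list(struct.unpack(">%dd" % count, payload))

values = [1.0, -2.5, 3.25]
payload = pack_doubles_big_endian(values)

# The same numbers in little-endian order: each 8-byte word is simply reversed.
little_payload = struct.pack("<%dd" % len(values), *values)

round_trip = unpack_doubles_big_endian(payload)

# Host properties relevant to items d) and e).
host_is_little_endian = sys.byteorder == "little"
pointer_bits = struct.calcsize("P") * 8
```

On an Intel or Alpha host, host_is_little_endian is true, so every double read from a big-endian file goes through the swap that the recommendation above would like to avoid.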
a) Are standards followed? Compiler standards? Library standards (e.g. STL, POSIX)? Are they fully supported (e.g. cint and STL)?
RDK) ROOT is relatively standard C++, and hides vast OS dependencies in OS interface modules (for Unix, Windows, Mac, etc). Currently, however, CINT has difficulty dealing with many advanced C++ features, and so certainly cannot be labeled standards-compliant.
b) Are there any support and maintenance standards or procedures? For example, any control over what goes into releases?
RDK) The ROOT team is small, and explicit written procedures for support and maintenance do not seem appropriate.
c) Are good coding practices (documentation) followed? Is there good developer's documentation (how easily can the product be "taken over"?)
RDK) ROOT follows a C++ coding style which is reasonable and which they have published: http://root.cern.ch/root/Conventions.html. The documentation in general is vast, though several pieces are missing. A short 5 page "What to do when you first use ROOT" tutorial would be invaluable, since a user's first experience with ROOT is through a clumsy command line interface (ROOT CINT) which has a non-obvious command language ("How do I QUIT ROOT!"). Also missing is significant English language documentation on what language (hopefully a proper subset of C++) CINT does support.
d) Are good "Computer Science" techniques and methods used, for example in a language interpreter (see below)?
RDK) ROOT CINT is not a good example of Computer Science techniques. It's a long story, and perhaps Chih-Hao and Scott Snyder can contribute to fill this in from their PasFrg/PasSuma talks.
e) Is there a design methodology applied? Are any design tools used? (such as a code generator). If so, do we need to have and/or support these tools?
RDK) No higher level tools appear to be used in the development of ROOT.
a) Are there any security maintenance concerns with the product?
RDK) There are security concerns only if we use one of the ROOT data server programs. These might be attacked to at least deny service, at worst to damage or alter data. ROOT developers may not be aware of buffer overruns within ROOT which could be exploited, at least, to "do anything" on a system that the "owner" of the server daemon has permission to do and, at worst, to "do anything" on a system, period. If we do not use ROOT server daemons, then the only security issues might be with IPC services used, but these are minor concerns related to who on a system can access the data (in shared memory, on a socket) of another program.
b) Is the product likely to crash and if so, how does it recover? What would be the impact? Do system managers need to intervene?
RDK) ROOT does not in general use a master server daemon, so crashes do not generally involve system managers. ROOT has some facilities to recover from crashes, for instance while a datafile is open. Perhaps Pasha could elaborate?
PM) ROOT output buffers are flushed regularly (after each 1 GByte written, for example). The autosave frequency is defined by the user. Autosave doesn't affect the efficiency of disk space usage. A datafile recovery procedure equivalent to that of HBOOK ntuple recovery is available.
c) Any government regulation applied to the product? Export restrictions?
RDK) No, there are no US regulatory problems to my knowledge. I cannot speak for other countries represented at FNAL.
d) Are there any Y2K issues?
RDK) ROOT does use a ROOT-specific time/date format (why oh why?), and it should be checked for Y2K safety.
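The autosave behaviour described under b) can be sketched as follows. This is only a toy model under stated assumptions, not ROOT's implementation: the class AutosaveWriter and its autosave_bytes parameter are hypothetical names, and an in-memory buffer stands in for the datafile. The idea is that data is pushed to the file each time a user-chosen number of bytes has accumulated, so a crash loses at most the unflushed tail.

```python
import io

class AutosaveWriter:
    """Hypothetical writer that flushes each time `autosave_bytes` have accumulated."""

    def __init__(self, sink, autosave_bytes):
        self.sink = sink                      # stand-in for the output datafile
        self.autosave_bytes = autosave_bytes  # user-defined autosave interval
        self.pending = bytearray()            # data not yet safely written
        self.flush_count = 0

    def write(self, record):
        self.pending.extend(record)
        if len(self.pending) >= self.autosave_bytes:
            self.flush()

    def flush(self):
        if self.pending:
            self.sink.write(bytes(self.pending))
            self.sink.flush()
            self.pending.clear()
            self.flush_count += 1

sink = io.BytesIO()
writer = AutosaveWriter(sink, autosave_bytes=16)
for _ in range(5):
    writer.write(b"12345678")  # five 8-byte records

# What would survive if the job crashed at this point.
saved_before_crash = sink.getvalue()
```

With five 8-byte records and a 16-byte interval, four records have reached the sink and one is still pending, which is the portion a recovery procedure could not restore.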
a) Will an interface/adapter need to be made to fit the tool in with the rest of the analysis tool framework (e.g. data import/export)? Does the product provide well defined and documented API/hooks for such an extension?
RDK) No specific adapter is required if one uses ROOT as a physics analysis tool. ROOT has enough hooks to allow user-developed formats to be read/written which are compatible with ROOT data browsing at some level. CDF, for instance, is attempting to use these hooks to write ROOT-compatible event data files.
b) Does the product use a language interpreter? If so, does it support the full language? Is it written following correct computer science techniques and algorithms? Is it sensitive to language changes, etc. (examples are COMIS and cint, which support only subsets of their language)? Or, turning the question around, is the language computationally complete? (Then forget about the language it was trying to emulate and treat it as a completely new language.)
RDK) ROOT uses a C/C++ interpreter called CINT. CINT supports a subset of the Standard C++ language, but it is not well-documented *what* subset of C++ is supported. Since it is not specifically documented what language is supported, it is impossible (if I understand the statement above) to say that CINT is computationally complete. It is not written using modern Computer Science techniques. For instance, there are no distinct parsing, syntax checking, and parse tree navigation phases. There is no easily edited, conceptually comprehensible grammar description.
RDK) One of the marketing statements from ROOT is that there is "only one language for users of ROOT to learn". This has clearly not been achieved. COMIS was not modern Fortran, and CINT is even less modern C++. Users must learn not only the features of C++ which are not supported (or are poorly supported) by CINT, like templates, STL, and C++ RTTI for dynamic dispatch, but they must also remember the C/C++ expressions which ROOT cannot handle (like ***p++, which in fact is not so absurd for certain coding styles). One cannot in general take "sophisticated" C++ code and run it with CINT, and sometimes "simple" C++ code fails too.
c) Is the data processing model correct? For example, in MATLAB, the model is to read in a file then process all its data - will this work with Run II sized files?
RDK) ROOT supports both extremes of data processing: read all data and work on it (efficient for small data sets), and read only a piece of the data, process it, and then do it again with the next piece of data. The user has control, at least at the data model design level, over where in between these extremes an analysis job will operate.
PM) Conceptually ROOT continues the development line we refer to as "CERNLIB" and it has a very good chance of becoming a successor to CERNLIB. The STAR collaboration at RHIC has started a full scale evaluation of ROOT not only as a PAW replacement, but as a CERNLIB replacement.
d) Are there restrictions on the input data (say size or format)? For example, in Histoscope the size of ntuples is limited because it is stored and processed in memory.
RDK) Input data must be in ROOT format, or have some code translate the data on the fly into ROOT format. There is no practical limit to my knowledge on input data size, but this may depend on the data model in use. ROOT exchanges input data between disk and memory as needed. ROOT may be limited, depending on the data model, to the virtual memory size of a machine for a piece of an event/histogram (branch), for an entire event/histogram, or for a ROOT I/O buffer. Given the cost of memory nowadays, I do not consider such limits to be of any concern.
e) What is the minimal environment to run the product? How does performance/capability scale up while the environment scales up?
RDK) This may vary between Unix and Windows. For Unix, one needs simply a supported system, ???) For Windows?
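The two extremes of data processing described under c) above can be sketched in a few lines. This is a generic illustration, not ROOT's I/O: the function names are hypothetical, and a Python list stands in for a data file that, in the chunked case, would be read from disk one piece at a time.

```python
def read_in_chunks(values, chunk_size):
    """Yield successive slices so that only one chunk needs to be held at a time."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]

def mean_all_at_once(values):
    """Extreme 1: hold every value in memory, then process."""
    return sum(values) / len(values)

def mean_chunked(values, chunk_size):
    """Extreme 2: process one piece, keep only running totals, move to the next."""
    total = 0.0
    count = 0
    for chunk in read_in_chunks(values, chunk_size):
        total += sum(chunk)
        count += len(chunk)
    return total / count

# Stand-in for one column of an ntuple.
energies = [float(i) for i in range(1, 11)]
chunk_sizes = [len(chunk) for chunk in read_in_chunks(energies, 4)]
```

Both functions give the same answer; the difference is that the chunked version needs memory for one chunk plus the running totals, which is the property that matters for Run II sized files.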
Scripting Language:
A Python interface will be added, using these items in the design process (3 man-months needed).
User Control:
The Python interface will provide those.
Data Selection:
The Python interface will provide those.
Input/Output:
Numeric and Mathematical Functionality:
Additional high-level functionality can be provided by Python add-ons.
Offline Compatibility:
All of those will be available for both Python and C.
Prototyping:
The Graphical Interface needs to be upgraded to handle column-wise ntuples and operations on multiple histograms, and to improve the 'glue' between components (Histoscope, NFit). (6 man-months)
--------------------------------- PasSuma Checklist - Nirvana ---------------------------- Philippe Canal ----------------------------
Difficult to quantify. However, NEdit, by the same authors and built with the same technologies, is being used successfully by a large community. Python has a very large community of users and a large set of resources. Eventually Histoscope and NFit could be distributed as Python add-ons, thus growing the user base.
Has existed since 1991. New versions have mostly been for added features and ports to new operating systems. The newest version (alpha release of version 5.00) introduces column-wise ntuples and redefines some file operations.
Nirvana is 'owned' by Fermilab and will survive as long as Fermilab supports it.
Fermilab
Fermilab's 'customer'.
Yes, it is already used by the HEP community, but there is no newsgroup. Python has newsgroups and knowledgeable people already in the HEP community.
Might need training for Python. GUI is good enough to be learnt on the spot.
Available for Python.
Same as Python (and the Nirvana group for Nirvana's specific parts).
Yes up to version 4. New features will need documentation.
No.
Free.
Free except for Fermilab's manpower.
Fermilab
Yes
No.
N/A
Around 130000 lines. The interpreter, the graphics part, and the C implementation should be separable to a large extent.
Yes
ANSI C and Motif. Standard makefiles.
Yes
Uses CVS
N/A
Yes
No
Yes
No
Yes
Will need an upgrade of the Graphical Interface and the addition of an interpreter (with Python the likely choice) (The latter comment seems to be a little hasty to me.)
Yes.
Yes
To a large extent.
Yes. Graphical Interface, interpreter and I/O are three sub-packages.
Currently needs Motif and Minuit. Will need Python.
Available on ALL platforms as of June 98.
ok.
not very sensitive.
ready.
Nirvana writes big-endian IEEE (IEEE floating point) files.
No
ok.
the few authors have entire control.
it's alright
yes
No high level design tool is currently used.
relies on .rhosts for some interprocess communication (actually rsh)
unlikely
no
Not that I know of.
Yes.
Yes
Yes
Histoscope can now also have disk-resident ntuples. The size limitation is now 2 GB.
I don't know for sure but it should be able to scale nicely.
Disclaimer - This report contains a combination of fact, observation, and opinion. It is our best understanding of the situation under our particular circumstances, and is not guaranteed to be correct or to apply to conditions other than those under which the evaluation was made. It is for the use of the Fermilab / HEP community only.
Scripting Language:
User Control:
Data Selection:
Input/Output:
Numeric and Mathematical Functionality:
Offline Compatibility:
Prototyping:
--------------------------------- PasSuma Checklist - MATLAB ---------------------------- Jeff Kallenbach ----------------------------
What is the customer base and what is their experience and opinion?
For commercial software or non-HEP freeware, one should get a list of customers and references.
There are a number of references on Mathworks' WWW pages. They claim to have around 400,000 users worldwide. I have asked for HEP references (users at SLAC, BNL, LANL, etc.) and am waiting.
How long has the product been in existence?
What version is the product at? How many major releases have there been? How often is there a minor release? Several major releases or regular minor releases with integrated bug fixes are good signs of a well supported mature product. Availability of published books on the product is also a sign of maturity as well as of an established customer base.
The product has been around for ~15 years. Current release is 5.2.
How long will the product survive?
Are there any competing products that are likely to win the market (including freeware)? Who is the product developer and are they well supported financially (graduate student or full time staff)?
There is a freeware version, called Octave. It is unclear whether this will really overtake MATLAB, though. The fact that the Octave graphics are still very crude indicates that not a lot of work is being applied to areas we consider important.
Who provides consulting support? Commercial, other Lab, CERN, Fermilab? Are they responsive?
Newsgroups and dejanews may provide some information on support response (though these tend to be biased). This is rather subjective and should be treated as such.
There are e-mail and phone hotlines for support questions. A local base of knowledge could easily be developed to help with FAQ and other straightforward questions. The newsgroup is busy, with about 800 articles posted in the last two to three weeks.
Who can get support? Particularly for commercial software, can any user of the product access the support services or are these limited to a pre-specified list of local contacts.
Anyone at the site would be able to contact the hotline.
Is the use of the product in the community enough that there is a pool of people/knowledge to draw from for support if needed?
HEP use should be assessed; PAW knowledge in the HEP community is widespread and Root is growing. A dedicated newsgroup would be a plus.
HEP usage is unknown. The newsgroup is busy, and the Q&A in there look healthy, ranging from beginner-level inquiries to refinements of graphics output. There seem to be plenty of users.
Is user training needed and available? What is the cost?
It is available and expensive (about $500/person/2-day course). From my experience the product is pretty easy to get started with using the doc.
Is training required and available for support staff? What is the cost (time and money)?
For commercial products such as SAS and IDL, support and user training may be required to optimally use the product, and that cost should be folded in.
There is such a thing, but it isn't yet priced.
How much (local) support will be required (is it complicated and hard to use)?
This and the remaining questions in this section can be determined by talking to current users or scanning any newsgroups, mailing lists or FAQs.
The product seems very easy to use. I would think, though, that a local working group or mailing list would be helpful as we write our data interface modules and other HEP-specific functions. The volume of licenses that we will require will need some attention, as well as coordination with the license server people (fnalu-admin).
For commercial or freeware, what kind and quality of user level support is provided?
The helpwin facility and www-based documentation are outstanding. Response to technical questions from the hotline has been very good. The books and helpwin are full of examples. Getting started with the product is very easy.
Is the software completely and well documented at the user level?
Yes - see above
Is a system manager required in order to install and/or maintain the package? If so, this would significantly complicate matters for some remote users who do not have ready access to (or a friendly relationship with) the system managers of their computer.
Not required, but recommended. Workarounds exist for the case where peons do the installation. UPS/UPD tailoring would be pretty easy.
What types of licensing are available?
Flexlm, floating seats exist for Un*x and NT (these are separate). However, any of our collaborators can use licenses off of our server. In addition, there is another breakdown of the Unix licenses into interactive (programmer's) and batch licenses.
What is the cost? For Universities? Lab?
The cost is very high. They are quoting ~$3000 per floating Unix seat per year, with 20% discounts for large volumes.
Who provides maintenance both local and external to the Lab? What are the fallbacks (if the maintainer(s) is run over by a bus or the company folds)?
I would envision a pool of local knowledge (mailing lists, module repository) to supplement the hotline. As a commercial product, it would not be "maintained" by fixing the code, but by feeding questions/problems back to MathWorks and then updating the local installations. If the company folds we could go to Octave.
Are the maintainers responsive and are bug fixes turned around in a reasonable amount of time?
We didn't encounter any bugs. The support line was responsive in the case of our two questions.
Does the software maintainer need additional training (beyond that needed by users). If so, is it available and at what cost.
Probably not - just relaying bugs and fixes.
What is maintenance/licensing costs for commercial products?
Two ways to go. We can buy a four year service agreement, in which case we pay for 4 years of upgrades/hotline/etc. for 20% of our purchase price. This includes perpetual licenses. Or, we can renew one year at a time, for 20% of the current price. In this case the license must be renewed each year.
How much software is there (line count)? How much needs to be supported locally (how many people required)? Can/should support be split up into areas of expertise (e.g. motif/graphics, interpreter, etc.). This is mainly significant for non-commercial software that will be maintained locally.
Probably N/A.
In the case of commercial software, is source code available (in escrow)?
This would be required for finding bugs locally or in case the company folds. This may be an additional cost.
It is not available.
What kind of build environment is provided. Is it robust?
This is mostly relevant for non-commercial software that may need to be co-maintained.
Standard build kits for Unix and Install Shield kits for Win32. It is distributed on CD; we could probably distribute it via ZIP files or UPD kits.
Can the package be built AT ALL on new or different sub-systems?
Root still provides NO makefiles.
Kits exist for all of our platforms.
Is the source repository accessible so that local support persons can select which changes to accept and which to reject for local use
N/A
Will the software have to be maintained and/or extended locally and externally? If so, can the software be maintained in a common repository. If separate repositories, what is the commitment to keep them from diverging from modifications, extensions and bug fixes.
This excludes locally maintained extensions which use pre-defined APIs or hooks into the product, which we will have to maintain ourselves in any case.
I can see us sharing modules and controlling those however we choose. But as far as changing the distribution, it will not occur.
Is the software passed through quality assurance software such as Purify or Insure++ before being put into production?
N/A
Are there any restrictions that would prevent the product from being placed in the run II infrastructure (i.e. UPS/UPD)? In particular, the ability to support more than one version of the product on the system.
No such restrictions.
Are release notes and change lists provided with releases?
For example, the commercial product IDL comes with "what's new" and release notes lists.
Yes - release notes are provided.
Are there active mailing lists/FAQs/newsgroups for the product?
How do they reflect on the product? Root has a support list, the commercial product IDL has FAQs and a newsgroup.
All of this exists. There is a newsgroup where users share ideas and code. The Mathworks WWW pages are very complete, with FAQs and the like.
Are recent releases extensions/enhancements and not bug fixes?
Yes. Version 5.2 is primarily an upgrade.
Are product releases reasonably paced and useful?
Yes. Much work is going into Win32 versions.
Will the tool/software need to be upgraded (additions/replacements) to satisfy Run II functional requirements, and how difficult will this be?
Does the product provide API/hooks to easily interface locally written extensions? Would existing support be able to handle this or would manpower need to be added? For example adding a command line (e.g. python) to Histoscope is thought to be difficult. Anticipated additions and replacements should be identified.
The only big deficiency I see is possibly with the data management. It is unclear how this will behave when fed a 2GB ntuple. Some code to help manage such data may have to be written. The API is excellent and well-documented. CD (PAT, I guess) would probably be able to handle the support with existing manpower. This would include getting distribution and licenses set up, upd'ifying the product, and helping the experimenters get going with their data interface modules. Also preferable would be management of a local module base and mailing list. I see maintenance of this product requiring less work than a HEP-ware product.
Is the product modular? Is it broken down logically and physically into reasonably distinct sub-systems? Can some of these sub-systems be replaced by external packages with the same functionality?
For example, in Root can Linear Algebra or Minimization be done by packages developed specifically to solve these topics separately from Root? A mini code review should be performed on the package to determine what would be involved in replacing such an identified component or sub-system.
I didn't actually do it, but I think such a thing could be done without much problem. For example, it would seem to be straightforward to replace (actually override) the plotting with OpenInventor graphics.
If new functionality needs to be added, is the software sufficiently modular such that the code changes can be localized?
For example, it is believed this is not the case for adding STL support to cint. A mini code review should be performed to assess whether extensive and/or destabilizing code changes would be required to add the functionality.
N/A
Is the software sufficiently modular to be such that bug fixes are localized?
A mini code review focused on a particular section or component of the software should provide information on this.
N/A
If a component needs to be replaced in its entirety, is the software sufficiently modular such that a new component can be slotted in with minimal disruption? For example, Root depends on functionality in cint other than the interpreter in a fundamental way.
??
What distinct (external) packages or interfaces are required to build and/or run the package... Motif (shared libraries), OpenGL, etc. Are there external software components that are out of the maintainer's control?
LHC++ depends on numerous commercial packages, Nirvana/Histoscope on motif. All such packages and interfaces should be identified.
This package is self-contained.
Platform availability: Linux, NT, Solaris, IRIX, HP, DEC-Unix ... ?
If a specific Run II platform isn't supported, what it would take to get support for it should be determined.
All Run II platforms are maintained. We could share licenses between Un*X platforms.
Porting of code to new platforms; this applies to non-commercial software that is currently only supported on selected systems.
The issue to be raised is the ability of the original developers to accept changes to be incorporated into the base code so any porting done here is done once (aside from effects of future OS upgrades) and does not have to be re-done with each release of the software.
N/A
How sensitive is the package to minor OS and/or compiler and/or system header variations?
Root under Linux may be sensitive to which C-libs are in use, which distribution you are using, which kernel you are using, which system header patches you have applied (esp., cint), and so on. PAW is sensitive to OS upgrades. This can be determined by looking at support history in newsgroups, other support logs, or talking to the user community.
64 bit considerations: Does the software run on 64 bit platforms/OS (alpha/Unix, SGI/Unix, future, e.g. merced)?
Will it be difficult to port to 64 bit systems (a la COMIS for PAW)?
Works on 64-bit platforms.
Are there Endianship and other heterogeneous environment considerations?
There don't seem to be
Is the software product build dependent on a specific compiler or is it compiler independent? If compiles are not needed for the product, are there any compiler dependencies present in the API used for locally written extensions?
In particular, if the software needs to be built with Run II compilers, it should be verified if it can.
N/A
Are there any support and maintenance standards or procedures? For example, any control over what goes into releases?
N/A
Are good coding practices (documentation) followed? Is there good developer's documentation (how easily can the product be "taken over"?)
N/A
Are good "Computer Science" techniques and methods used, for example in a language interpreter (see below)?
N/A
Is there a design methodology applied? Are any design tools used? (such as a code generator).
If so, do we need to have and/or support these tools?
N/A
Are there any security maintenance concerns with the product?
None
Is the product likely to crash and if so, how does it recover? What would be the impact? Do system managers need to intervene?
I wasn't able to crash it on NT. There are reports of a crash on the newsgroup, with what sounded like a typical seg fault.
Any government regulation applied to the product?
Export restrictions?
No government regulations. Foreign collaborators could probably make use of our licenses, but this may cost more.
Are there any Y2K issues?
Will an interface/adapter need to be made to fit the tool in with the rest of the analysis tool framework (e.g. data import/export).
Does the product provide well defined and documented API/hooks for such an extension?
Yes. The basic MATLAB data model is a double precision array. Interfaces will have to be written from experiments' data models.
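As a rough idea of what such a data interface module would have to do, the sketch below flattens a list of event records into a single array of doubles, column by column (MATLAB stores arrays in column-major order). It is written in Python rather than MATLAB, and the event fields ("run", "pt") and the function name are hypothetical stand-ins for an experiment's data model, not part of any existing interface.

```python
from array import array

def events_to_column_major(events, fields):
    """Flatten event records into one array of doubles, one column per field."""
    flat = array("d")  # 'd' is a C double
    for name in fields:
        for event in events:
            flat.append(float(event[name]))
    return flat, (len(events), len(fields))

# Hypothetical event records standing in for an experiment's data model.
events = [
    {"run": 1, "pt": 10.5},
    {"run": 1, "pt": 7.25},
    {"run": 2, "pt": 3.0},
]
flat, shape = events_to_column_major(events, ["run", "pt"])
```

Everything, including integer quantities such as the run number, ends up as a double, which is the main consequence of the data model for anyone writing the interface.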
Does the product use a language interpreter. If so does it support the full language. Is it written following correct computer science techniques and algorithms? Is it sensitive against language changes, etc.
(examples are COMIS and cint, which support only subsets of their language). Or, turning the question around, is the language computationally complete? (Then forget about the language it was trying to emulate and treat it as a completely new language.)
MATLAB has a very nice interpreter which can be learned pretty easily. In addition, the commands can be saved in scripts or compiled to shared objects. It is not C++, though.
Is the data processing model correct?
For example, in MATLAB, the model is to read in a file then process all its data - will this work with Run II sized files?
A problem - some code will have to be written to handle Run II sized files.
Are there restrictions on the input data (say size or format)?
For example, in Histoscope the size of ntuples is limited because it is stored and processed in memory.
MATLAB has a similar restriction at this time.
What is the minimal environment to run the product?
How does performance/capability scale up while the environment scales up?
The program (Win32 version) runs fine on a minimal PC.