Analysis Preservation: sharing best practice between experiments

Today I had the pleasure of giving a talk at the general meeting of the LHCb collaboration — something which is unusual for a member of the ATLAS collaboration!

Large scientific collaborations need private spaces where they can discuss things away from the scrutiny of other experimentalists and theorists. There are some things which you’d prefer to discuss with your family before shouting them out in the street: it’s the same with groups of scientists. Discussing and reviewing preliminary results, troubleshooting issues with the detector, and writing and editing papers are all activities which need a period of calm internal reflection before the finished product is shown to the wider world. But occasionally we invite each other to give topical talks, so that we can learn from each other’s collaboration practices.

My talk today was a report on how ATLAS approaches the problem of analysis preservation, with a view to reinterpretation. This is a topic I have been rather involved with: a few years ago I wrote the internal ATLAS Analysis Preservation Policy, based on the recommendations of the LHC Reinterpretation Forum. The basic idea is that after years of preparing a physics result and writing the paper, an experimentalist can be tempted to just submit the paper to arXiv and the journal and move on to something else. Unfortunately, that’s not really enough for the result to be re-used in the long term.

The problem is that our analyses typically concentrate on one model at a time. But there could be other models to which the analysis would also be sensitive, because they involve a similar final state. Since such models might not even have been invented yet, we can’t simply make our papers interpret the results in terms of every possible model they might be sensitive to. Nor can we afford to carry out a new analysis for each model which comes up: a paper takes 3-4 years from kick-off to publication. Instead, we need to preserve the results of the analysis in such a way that the interpretation can be done after the fact, even 20 or 30 years from now.

This means preserving and publishing on the HEPData portal:

  • The numerical values behind all the plots (this might seem obvious, but it is not always done!), where possible including the breakdown of statistical and systematic uncertainties;
  • A piece of computer code encoding the selections applied to events to choose the ones which enter the analysis, or at least enough information in the form of tables to approximate this step (a sketch follows below);
  • A piece of computer code encoding the statistical fit which was used to extract the final result (see the pyhf example below this list).
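
To make the second item concrete, here is a minimal sketch in Python of what a preserved event selection might look like. The variable names and cut values are entirely made up for illustration; in ATLAS the real thing is more often provided through dedicated frameworks such as SimpleAnalysis or Rivet, but the essence is the same: a function that anyone can run on their own simulated events.

```python
# A minimal sketch of a preserved event selection.
# All variable names and cut values here are hypothetical --
# a real analysis would publish its actual selection, e.g. as a
# SimpleAnalysis routine or a Rivet analysis.

def passes_signal_region(event):
    """Return True if the event enters the (illustrative) signal region."""
    # Require at least two jets with transverse momentum above 50 GeV
    jets = [jet for jet in event.jets if jet.pt > 50.0]
    if len(jets) < 2:
        return False

    # Require large missing transverse momentum (in GeV)
    if event.met < 200.0:
        return False

    # Veto events containing an isolated lepton
    if any(lepton.isolated for lepton in event.leptons):
        return False

    return True
```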

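And for the third item, ATLAS has for some years now been publishing the full statistical models of many searches on HEPData in the pyhf JSON format, which means the final fit can be reproduced (or modified for a new signal model) decades later with a few lines of code. A rough sketch, where the filename and the tested signal strength are placeholders:

```python
import json

import pyhf

# Load a statistical model published on HEPData ("workspace.json" is a
# placeholder for whichever file you actually download)
with open("workspace.json") as f:
    spec = json.load(f)

workspace = pyhf.Workspace(spec)
model = workspace.model()     # build the probability model
data = workspace.data(model)  # observed data plus auxiliary measurements

# Reproduce the maximum-likelihood fit, including the signal strength mu
best_fit = pyhf.infer.mle.fit(data, model)

# Run a CLs hypothesis test for a signal strength of mu = 1
cls_obs = pyhf.infer.hypotest(1.0, data, model, test_stat="qtilde")
print(f"Observed CLs at mu = 1: {float(cls_obs):.3f}")
```

The point of preserving the fit in this machine-readable form is that a theorist can swap in the predictions of a new model and redo the interpretation without needing access to any internal experiment software.
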
These items are easy to write down, but in practice they can be very hard to achieve, especially for searches for unconventional signatures such as long-lived particles. But we are making progress. The LHCb collaboration has its own structures and procedures, but there is always room to learn and improve. As we approach the end of LHC operations (a mere 15 years away), this topic is coming closer to the forefront of experimentalists’ minds. Indeed, there may already be searches and measurements made now which it won’t make sense to re-do later in the LHC programme. So getting analysis preservation right today is becoming critical. What we do in ATLAS is not perfect, but by exchanging information between collaborations about what works and what doesn’t, we can come closer to a set of best practices which will withstand the test of time.
