Towards a Global Analysis and Data Centre for Multi-Messenger Astroparticle Physics
The current state of distributed storage for cosmic-ray astrophysics is considered. The main goal of AstroDS is to unite the existing astrophysical data storage facilities of a number of experimental collaborations, such as TAIGA, TUNKA, KASCADE and others.
The paper presents a solution for decentralized management of data access rights in geographically distributed systems with users from different institutions, which implies a possible lack of trust between the user groups. The solution is based on distributed ledger technology (DLT) combined with provenance-metadata-driven data management.
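The abstract does not specify the ledger implementation; the following is a purely illustrative sketch of the underlying idea, an append-only, hash-chained log of access-right grants annotated with provenance metadata, so that no single user group can retroactively alter the record. All field names and values are assumptions, not the authors' design.

```python
# Illustrative sketch only: a tamper-evident, hash-chained log of access-right
# grants, the kind of record a DLT-based rights system relies on.
import hashlib
import json
import time

class RightsLedger:
    def __init__(self):
        self.blocks = []

    def append(self, record):
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        body = {"record": record, "prev": prev_hash, "ts": time.time()}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.blocks.append({**body, "hash": digest})

    def verify(self):
        """Recompute the hash chain to detect any retroactive modification."""
        prev = "0" * 64
        for block in self.blocks:
            body = {k: block[k] for k in ("record", "prev", "ts")}
            digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if block["prev"] != prev or block["hash"] != digest:
                return False
            prev = block["hash"]
        return True

# Hypothetical usage: grant read access and record the data provenance.
ledger = RightsLedger()
ledger.append({"user": "alice@taiga", "dataset": "tunka-rex/2019",
               "action": "grant-read",
               "provenance": "derived-from raw/2019-run-042"})
assert ledger.verify()
```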
A few days ago we released the new KCDC version PENTARUS, which for the first time includes an additional DataShop. A brief outline of the new features is given here, as well as possible future perspectives.
The current state of the metadata catalog for cosmic-ray astrophysics is considered, and a method for its integration with external data storage facilities is proposed.
The Tunka Radio Extension (Tunka-Rex) is a cosmic-ray detector operating since 2012. The detection principle of Tunka-Rex is based on the radio technique, which impacts data acquisition and storage. We present the Tunka-Rex Virtual Observatory (TRVO), a framework for open access to the Tunka-Rex data, which is currently being prepared for its first release.
High-precision simulation of particle showers in electromagnetic calorimeters is a computationally expensive and time-consuming process. Fast simulation of particle showers using generative models has been suggested as a way to save significant computational resources. The objective of this study is to perform a fast simulation of particle showers in the Belle II calorimeters using deep...
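The abstract does not specify the network architecture; as a purely illustrative toy example of a generative-model generator that maps a latent vector plus the incident-particle energy to a grid of calorimeter cell energies, one might write something like the following (layer sizes, shapes and names are assumptions, not the authors' model).

```python
# Toy sketch of a conditional generator for fast shower simulation (illustrative only).
import torch
import torch.nn as nn

class ShowerGenerator(nn.Module):
    def __init__(self, latent_dim=64, grid_cells=25 * 25):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 1, 256),   # +1 for the conditioning energy
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, grid_cells),
            nn.ReLU(),                        # cell energies are non-negative
        )

    def forward(self, z, energy):
        return self.net(torch.cat([z, energy], dim=1))

# Usage: sample 10 showers conditioned on a 1 GeV incident energy.
gen = ShowerGenerator()
z = torch.randn(10, 64)
e = torch.full((10, 1), 1.0)
showers = gen(z, e)                           # shape: (10, 625)
```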
In this talk we will describe various data caching scenarios and lessons learned. In particular, we will discuss local data cache configuration, deployment, and testing. We are using xCache, a special type of Xrootd server set up to cache input data for a physics analysis. A relatively large Tier-2 storage is used as the primary data source and several geographically distributed smaller...
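The talk abstract does not include the configuration itself; a minimal xCache (Xrootd proxy file cache) configuration along the lines described, with the Tier-2 storage acting as the origin, might look as follows. Hostnames, paths and sizes are placeholders, not the deployed setup.

```
# Minimal illustrative xCache configuration (placeholders only)
# Enable the proxy storage plugin and the proxy file cache library
ofs.osslib   libXrdPss.so
pss.cachelib libXrdPfc.so
# Tier-2 storage used as the primary (origin) data source -- placeholder host
pss.origin   tier2-se.example.org:1094
# Local disk area that holds the cached data
oss.localroot /xcache/data
# Cache tuning: RAM buffer and low/high disk-usage purge watermarks
pfc.ram       16g
pfc.diskusage 0.90 0.95
all.export   /
xrd.port     1094
```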
EOS is a CERN-developed storage system that serves several hundred petabytes of data to the scientific community of the Large Hadron Collider (LHC). In particular, it provides services to the four largest LHC particle detectors: LHCb, CMS, ATLAS and ALICE. Each of these collaborations uses different workflows to process and analyse its data. EOS has a monitoring system that collects detailed...
The KM3NeT neutrino detector, consisting of several building blocks for the water Cherenkov detection of relativistic charged particles, is currently under construction at various deep-sea locations in the Mediterranean Sea. As an inter-domain experiment spanning neutrino physics, astroparticle physics and astrophysics, KM3NeT's data processing and data publication draw on computing paradigms and...
The ability to investigate the 3D structure of biomolecules, such as proteins and viruses, is essential in biology and medicine. With the invention of super-bright X-ray free electron lasers (XFELs), the Single Particle Imaging (SPI) approach makes it possible to reconstruct 3D structures from many 2D diffraction images produced in the experiment by X-rays scattered off the biomolecule exposed in different...
The emergence of super-bright light sources, X-ray free electron lasers (XFELs), combined with the Single Particle Imaging (SPI) method makes it possible to obtain nanometer-resolution 3D structures of biological particles such as proteins or viruses without needing to freeze them. SPI relies on the “diffraction before destruction” principle, meaning that each sample only produces a single...
The fact that over 2000 programs exist for working with various types of data, including Big Data, makes the issue of flexible storage a crucial one. Storage can take various forms, including portals, archives, data marts, databases of different kinds, data clouds and networks. These can be connected synchronously or asynchronously. Because the type of data is frequently...
We propose a system for executing low-priority non-parallel jobs on idle supercomputer resources to increase the effective load of these resources. The jobs are executed inside containers, so that a checkpoint mechanism can be used to save the state of a job during execution and resume it on a different node. Thanks to splitting the execution of the low-priority jobs into separate shorter...
We propose a system to increase the effective load of supercomputer resources. The key idea of the system is that when idle supercomputer nodes appear, low-priority non-parallel jobs are started on these nodes and occupy them until a regular job from the main queue of the supercomputer arrives. Upon arrival of the regular job, the low-priority jobs temporarily interrupt their execution and wait for the...
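The abstracts above do not give implementation details; a schematic sketch of the preempt-and-resume loop they describe, with the site-specific batch-system queries stubbed out as hypothetical placeholders, could look like this. The container checkpoint uses the experimental, CRIU-based `docker checkpoint create` command as one possible realization; the actual runtime and commands are assumptions.

```python
# Schematic sketch of the backfill loop described above (not the authors' code).
# idle_nodes() and regular_job_pending() are hypothetical placeholders for the
# site-specific batch system; job containers are preempted via a checkpoint.
import subprocess
import time

def idle_nodes():
    """Placeholder: query the batch system for currently idle nodes."""
    return []

def regular_job_pending(node):
    """Placeholder: true when the main queue has scheduled a regular job on this node."""
    return False

def start_low_priority_job(node, job_id):
    # Run the job inside a container so its state can later be checkpointed.
    subprocess.run(["ssh", node, "docker", "run", "-d", "--name", job_id,
                    "lowprio-image", "run-job"], check=True)

def checkpoint_and_stop(node, job_id):
    # Save the container state so the job can resume later, possibly elsewhere.
    subprocess.run(["ssh", node, "docker", "checkpoint", "create",
                    job_id, job_id + "-ckpt"], check=True)

def backfill_loop(queue):
    running = {}                              # node -> low-priority job id
    while True:
        for node in idle_nodes():             # occupy idle nodes with low-priority jobs
            if node not in running and queue:
                job_id = queue.pop(0)
                start_low_priority_job(node, job_id)
                running[node] = job_id
        for node, job_id in list(running.items()):
            if regular_job_pending(node):     # regular job arrived: preempt
                checkpoint_and_stop(node, job_id)
                queue.append(job_id)          # re-queue to resume on another node
                del running[node]
        time.sleep(30)
```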