RE: The future of particle physics - from the Large Hadron Collider to future circular colliders
We measure the amount of data in inverse barns. For the LHC, we expect 3 inverse attobarns (3 ab⁻¹) per experiment at the end of the run. Here, the goal is 30 ab⁻¹: exactly a factor of 10 more.
Let me try to estimate... (sorry for being curious about the computing part). Will Tier 0 receive ~290 TB of data a day (with Tier 1 staying around 29 TB/day after filtering), or is the experiment more diluted in time, i.e. taking longer to execute? I am trying to compare with the ALICE experiment (which generated ~4 Gb/s, if I'm not wrong).
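To make the comparison concrete, this is the back-of-the-envelope conversion I have in mind (only the figures quoted in this thread; note that whether ALICE's rate is meant in bits or bytes changes the result by a factor of 8):

```python
# Back-of-the-envelope: convert a sustained data rate into a daily volume.
# The numbers are only the figures quoted in this thread, not official values.

SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def daily_volume_tb(rate_gb_per_s):
    """Daily data volume in TB for a sustained rate given in GB/s."""
    return rate_gb_per_s * SECONDS_PER_DAY / 1000.0  # 1 TB = 1000 GB

print(daily_volume_tb(4.0))            # 4 GB/s          -> ~345 TB/day
print(daily_volume_tb(0.5))            # 4 Gb/s = 0.5 GB/s -> ~43 TB/day
print(290 / SECONDS_PER_DAY * 1000)    # 290 TB/day      -> ~3.4 GB/s sustained
```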
There are researchers investigating these questions (I have only barely followed that development, so I don't have many details to share).
There will be way too many collisions to record them all, and one needs to design strategies to decide which ones to record and which ones to ignore. This requires making assumptions about the technology that will be available in 20 years. Extrapolating from what is available today with Moore's law, everything should be fine (I remember this from a talk I attended two years ago): all interesting processes (with electroweak rates or lower) could be recorded.
I may however need to update myself on the topic. I don't remember this being discussed at the last FCC workshop at CERN, which may mean it is actually not a problem.
I was not saying it was a problem. I've read many documents the LHC team released about its architecture, the LHC Grid. It is quite an "uncommon" architecture, but of course I understand you need it.
I was asking for a comparison with ALICE because that experiment generated an amount of data which is a well-known case study (also) in my sector:
http://aliceinfo.cern.ch/Public/en/Chapter2/Chap2_DAQ.html
Again, I didn't say it is a problem: sure, it is a challenge. Very interesting, btw. I like the idea of filtering data on Tier 0 and then transmitting it to Tier 1 for reconstruction. Quite an unusual approach, but interesting.
Sorry for the late reply. I am mostly not in front of the computer/cell phone during the weekend (I already spend more than 70 hours in front of them during the week :p )
Double sorry for having ignored the ALICE comparison. I just forgot to comment... thanks for insisting. To start with, what is said below probably holds for all LHC experiments. Then, a disclaimer: I want to recall that I am a theorist, so I will try to answer as well as I can, but this level of technicality is not directly connected to what I do.
The triggers are the key. Not all collisions are recorded: decisions are made so that only the interesting collisions are kept, within the constraints of the electronics. The challenges, at the time the experiment was designed, were to have good triggers that guarantee that all interesting events would be recorded, and a data acquisition system good enough to keep up with the rhythm. I think these challenges have been successfully met.
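Just to illustrate the logic (a toy sketch only, for intuition; the event variables and thresholds below are invented and have nothing to do with the real trigger code): an online trigger is essentially a fast yes/no decision applied to every collision, keeping only the small fraction that passes some selection.

```python
import random

# Toy sketch of a trigger decision: keep a collision only if it looks
# "interesting" according to some fast, cheap criteria. The thresholds
# and event content below are invented for illustration only.

def toy_trigger(event):
    """Return True if the event should be recorded."""
    # e.g. require at least one energetic lepton or a large energy deposit
    return event["lepton_pt"] > 25.0 or event["total_energy"] > 500.0

def fake_event():
    """Generate a random fake collision (most are uninteresting)."""
    return {
        "lepton_pt": random.expovariate(1 / 5.0),      # GeV, mostly soft
        "total_energy": random.expovariate(1 / 50.0),  # GeV, mostly small
    }

events = [fake_event() for _ in range(100_000)]
recorded = [e for e in events if toy_trigger(e)]
print(f"kept {len(recorded)} / {len(events)} collisions "
      f"({100 * len(recorded) / len(events):.2f}%)")
```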
Note that in this chain you also have Tier 2 and Tier 3 machines, so the analysis work is done on the lower-tier machines, which can request parts of the data with some filters.
I hope all of this helps. Do not hesitate to continue the discussion ^^
Don't worry, I was just discussing out of curiosity, so you don't need to apologize for the late answer. I am that curious because, even though I work in the private sector, your "big data" is a use case for us, which is why I ask.
Of course I have the IT perspective: to me, the sensors you are using are real-time (meaning they cannot wait if the peer is slow), so the challenge is "capacity". But you say the data flow is the same as ALICE's, so I think this is because you filter the data using "triggers", which makes it interesting.
Basically I'm into Kohonen self-organising maps (SOMs) and derivatives: self-learning AI which can divide data into "similar" sets. For example, if you feed in economic/demographic/social data of countries and ask it to colour the similar ones alike, you see this:
So basically, your "triggers", which are able to decide what to keep, are very interesting to me. OK, you only have two categories (keep or discard), while a SOM may decide how many categories there are, but the problem is still the same: putting apples with apples and oranges with oranges, without being able to predict which kind of fruit you're going to receive.
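To make the analogy concrete, here is a minimal sketch of a Kohonen SOM on toy data (the grid size, learning schedule and the two "fruit" blobs are entirely my own invention, nothing to do with the country map above): the map is just a grid of reference vectors pulled towards the inputs that land on them, so similar inputs end up on nearby cells.

```python
import numpy as np

# Minimal Kohonen self-organising map on toy 2-D data.
# Grid size, learning rate and data are invented for illustration only.

rng = np.random.default_rng(0)

# Toy data: two well-separated blobs ("apples" and "oranges").
data = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(200, 2)),
    rng.normal(loc=[3.0, 3.0], scale=0.3, size=(200, 2)),
])

# A 5x5 grid of neurons, each with a 2-D weight vector.
grid_w, grid_h = 5, 5
weights = rng.uniform(-1, 4, size=(grid_w * grid_h, 2))
coords = np.array([(i, j) for i in range(grid_w) for j in range(grid_h)])

def train(weights, data, epochs=20, lr0=0.5, sigma0=2.0):
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)              # shrink learning rate
        sigma = sigma0 * (1 - epoch / epochs) + 0.5  # shrink neighbourhood
        for x in rng.permutation(data):
            # best matching unit = neuron whose weights are closest to x
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # pull the BMU and its grid neighbours towards x
            dist2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            influence = np.exp(-dist2 / (2 * sigma ** 2))
            weights += lr * influence[:, None] * (x - weights)
    return weights

weights = train(weights, data)

# Each data point lands on the cell of its best matching unit:
cells = [int(np.argmin(np.linalg.norm(weights - x, axis=1))) for x in data]
print("cells used by the first blob: ", sorted(set(cells[:200])))
print("cells used by the second blob:", sorted(set(cells[200:])))
```

If the map has organised itself, the two blobs end up on disjoint groups of neighbouring cells, which is the same "put like with like without knowing the categories in advance" idea.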
I will keep looking for documentation. Thank you for your time; your explanations about particle physics are interesting.