Coronavirus: which data to watch to keep an eye on the current situation

And which ones tell us little. A recap of the most significant statistical indicators, in weeks when every trend points to inexorable growth in the circulation of the new coronavirus SARS-CoV-2

(photo: Bst Agency / Getty Images)

Months have passed since the first peak of the Covid-19 epidemic in Italy, yet a certain confusion still seems to persist in interpreting the data that institutions (starting with the Civil Protection) release daily in their bulletins. Take Monday 12 October: +3,689 currently positive, 39 deaths, 891 recovered or discharged, and above all 4,619 new positive cases identified. What do these numbers really tell us?

The first point, which still does not seem to have been fully absorbed - not even by many media outlets, judging by the daily headlines - is that a single bulletin cannot tell us how the infection is evolving in our country. As we have explained several times here on Wired, each day's data are in fact a cauldron into which information collected with different delays from different areas of Italy flows together, so they capture a situation from a few days earlier and cannot even be tied to a specific calendar date. Furthermore, a look at the time series is enough to see that the data follow weekly cycles: Monday, for example, is almost always the day with the lowest values, presumably because of the weekend effect, that is, the fact that activity is less intense on Saturday and Sunday. Conversely, Thursday and Friday are the days with relatively higher bulletin numbers.
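
To make the weekly pattern concrete, here is a minimal Python sketch that averages a daily series by day of the week. The figures are invented for illustration and are not taken from the Civil Protection bulletins; the function name `weekday_means` and the start date are likewise assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def weekday_means(start, values):
    """Average a daily series by day of the week.

    `start` is the calendar date of the first value in `values`.
    """
    by_day = defaultdict(list)
    for offset, value in enumerate(values):
        day = start + timedelta(days=offset)
        # date.weekday() returns 0 for Monday ... 6 for Sunday
        by_day[WEEKDAYS[day.weekday()]].append(value)
    return {name: mean(vals) for name, vals in by_day.items()}

# Hypothetical daily new positives for two weeks (NOT real bulletin data),
# starting on Monday 28 September 2020.
daily_cases = [100, 130, 140, 160, 170, 150, 130,
               120, 160, 170, 190, 200, 180, 160]

means = weekday_means(date(2020, 9, 28), daily_cases)
lowest_day = min(means, key=means.get)
highest_day = max(means, key=means.get)
```

With this made-up series Monday comes out lowest and Friday highest, which is the kind of periodicity described above; real data would need many more weeks before such a pattern could be read with any confidence.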

And if a Monday-evening headline about "declining infections" is obviously a non-story, because it rests on a distortion in the way the data are collected and not on any real change in the circulation of the virus, it must also be added that, more generally, the variation from one day to the next is never a reliable indication. Reliable indications can only come from analyzing the situation in terms of trends, i.e. by looking at how the data move over time, or alternatively by working with fixed or moving averages, for example over 3, 5 or 7 days. All these calculations, of course, agree in showing a progressive growth in the circulation of the virus.
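
As a rough illustration of why averages are more informative than day-to-day changes, the sketch below computes a trailing 7-day moving average in Python. The daily counts are hypothetical numbers chosen for the example, not real bulletin data, and `moving_average` is an illustrative helper rather than any official method.

```python
from statistics import mean

def moving_average(values, window=7):
    """Trailing moving average: one value per day once `window` days exist."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [mean(values[i - window + 1 : i + 1])
            for i in range(window - 1, len(values))]

# Hypothetical daily new positives over two weeks, Monday first (NOT real data).
daily_cases = [100, 130, 140, 160, 170, 150, 130,
               120, 160, 170, 190, 200, 180, 160]

smoothed = moving_average(daily_cases, window=7)

# Day-to-day change from the first Sunday to the second Monday: a "decline".
sunday_to_monday = daily_cases[7] - daily_cases[6]

# Whether the 7-day average rises at every step over the same period.
average_always_rising = all(a < b for a, b in zip(smoothed, smoothed[1:]))
```

In this invented series the count drops from Sunday to Monday, yet the 7-day average rises at every step: the single-day change says the opposite of the trend.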

The swab issue

Although with some delay after the pandemic reached Italy, it has now become standard practice not to report the number of new positive cases alone, but to compare it with the number of swabs performed. Naturally, the more you look for the virus, the more you find it, or at least the share of infected people who are detected increases. If the now-famous ratio of positives to swabs, expressed as a percentage, is certainly a step forward compared with the number of positives alone, that does not make it an infallible indicator in itself.

First of all because it shows slight anomalies in the days around the weekend (presumably because certain screening activities, or tests to confirm recovery, are carried out at different intensities), and then because it is sensitive to the strategy with which the swabs themselves are carried out. If people were swabbed completely at random, the share of positives would clearly be lower than if close contacts of positive people or, even more so, symptomatic people were tested. Finally, rather than all swabs performed, it would be useful to calculate this ratio using the number of people given a first swab as the denominator, leaving control swabs out of the calculation. Looking ahead, since this is an emerging issue, it is not even clear how the so-called rapid tests, which also use a swab but rely on antigen detection, will be included in this calculation.
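
The effect of the denominator can be sketched with a few lines of Python. All the numbers below are hypothetical round figures chosen for readability, not values from any bulletin, and the split between first swabs and control swabs is an assumption made purely for the example.

```python
def positivity_rate(new_positives, tests):
    """Share of tests that came back positive, as a percentage."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    return 100 * new_positives / tests

# Hypothetical one-day figures (NOT real data).
new_positives = 5_000
total_swabs = 100_000      # every swab performed that day
control_swabs = 40_000     # assumed repeat swabs on already known cases

# Ratio as usually reported: positives over all swabs.
rate_all_swabs = positivity_rate(new_positives, total_swabs)

# Ratio over people tested for the first time only.
rate_first_swabs = positivity_rate(new_positives, total_swabs - control_swabs)
```

With these invented inputs the same day reads as 5% or as roughly 8.3%, depending only on which swabs are counted, which is why the ratio is better treated as a rough indication than as a precise measurement.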

With these caveats in mind, which mean the percentage should be read without dwelling too much on the decimals, the positive/swab ratio nevertheless seems, on the whole, the best tool available today for gauging the general circulation of the virus, at least as a rough indication. Here too the trend that emerges is one of inexorable growth: over the last month we have gone from values between 1% and 2% to more than 5% in recent days.

Clinical data

Another way of describing the evolution of the epidemic is to rely only on serious and very serious cases, i.e. to monitor hospital admissions of symptomatic people, admissions to intensive care and deaths. Granted that in many cases Covid-19 is not the only condition affecting patients, though the distinction between deaths from and deaths with Covid-19 is nonsense, the trend of these data over time is a clear indication of the situation, also because it allows a comparison with the situation of one or more months ago more easily than other types of numbers do.

The most emblematic figure is probably that of intensive care hospitalizations, which after bottoming out at 38 in midsummer have gradually risen to the current level of over 450. The same goes for deaths, which averaged fewer than 10 per day in July and August and are now rising (not by much, for now) to around 30. For these data too, however, it is worth relying on somewhat more robust statistical tools - such as a simple weekly average - rather than reasoning on daily updates.


Although the circulation of the virus in recent weeks has been much more evenly spread across the country than it was in March and April, looking only at national data can lead to some distortions. The potentially critical situations are in fact in the areas where the concentration of cases is highest, and where it will be possible to intervene with stricter measures. Alongside the overall data, therefore, it may make sense to go into the regional or even provincial detail of the trends.

The view could therefore take two perspectives: one global, which focuses on the overall situation of the pandemic, the other local, focused on the actual circulation of the virus in specific urban areas. National data are a middle ground between the two: they have their own value but are not necessarily the best scale for describing what is happening.

Watch out for comparisons

On social media as much as in public, political and media debate, comparisons between situations are frequent: between Italy and other European countries, between today and last spring, between Covid-19 and other diseases, just to name some of the most talked about. At the extreme, there are even attempts to compare the probability of dying from the new coronavirus with that of being hit by an asteroid.

However, even leaving aside the crazy arguments, many of these comparisons make little sense from either a logical or a statistical point of view. For example, when it comes to counting positive cases, the strategies with which monitoring is carried out across the country are decisive. This means that both geographically (especially between one country and another) and over time, comparisons can be distorted by these differences: today's roughly 5 thousand new positive cases are not comparable with those recorded at the beginning of March.

A purely conceptual, but decisive, comparison is the one between corresponding periods of different years. At the moment it is an impossible comparison, because a year ago the data (at least the official ones) were all zero, but the fact that already in mid-October we are dealing with a large number of cases and with significant growth raises some concern. The fear, in fact, is that the cold season could produce a dynamic similar to that of seasonal flu viruses, in which case today we would be only at the beginning of the wave of infections. This type of comparison will probably be discussed a great deal in the future, because it offers a way of understanding how the circulation of the virus changes from one year to the next and of adjusting prevention behaviors accordingly.

