How Valid are the Reported Cases of People Infected with Covid-19 in the World?

Raul Isea

doi:10.14302/issn.2692-1537.ijcv-20-3376


Review Article


Open Access
Peer Reviewed
CC BY 4.0

Raul Isea¹

¹Fundacion IDEA, Hoyo de la Puerta, Baruta, Venezuela

Abstract

The goal of this paper is to analyze the registered cases of people infected with Covid-19 throughout the world, using a digital forensic analysis technique based on Benford's Law. Twenty-three countries were randomly chosen for this analysis: China, India, Germany, Brazil, Venezuela, Netherlands, Italy, Colombia, Russia, Norway, South Africa, Portugal, Singapore, United Kingdom, Chile, Ecuador, Egypt, Denmark, Ireland, France, Belgium, Australia and Croatia. We calculate the p-values of the Pearson χ² test and the Mantissa Arc Test from the first-digit results. If a country fails both tests, a third test, the Freedman-Watson test, is carried out. The results indicate that the data from Italy, Portugal, Netherlands, United Kingdom, Denmark, Belgium and Chile are suspected of manipulation, because the numbers fail Benford's Law according to the results obtained up to April 30, 2020. However, further studies are needed in these countries to establish whether the information was in fact manipulated or altered.

Article Information

Received: 11 May 2020
Accepted: 26 May 2020
Published: 28 May 2020
Journal: International Journal of Coronaviruses
Volume / Issue: Vol 1, Issue 2
Pages: 53–56
ISSN: 2692-1537
Type: Review Article
DOI: 10.14302/issn.2692-1537.ijcv-20-3376

Academic Editor: Sasho Stoleski, Institute of Occupational Health of R. Macedonia, WHO CC and Ga2len CC, Macedonia

Checked for plagiarism: Yes

Review by: Single-blind

License

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Corresponding author: Raul Isea, Fundación IDEA, Hoyo de la Puerta, Baruta, Venezuela

Competing Interests

The authors have declared that no competing interests exist.

Funding

No specific funding statement was provided by the authors.

Data Availability

No data-availability statement was provided by the authors.

Acknowledgements

I would like to thank Karl E. Lonngren for his comments on this manuscript.

Citation:

Raul Isea (2020) How Valid are the Reported Cases of People Infected with Covid-19 in the World? International Journal of Coronaviruses 1(2):53-56. https://doi.org/10.14302/issn.2692-1537.ijcv-20-3376


Introduction

In December 2019, the first cases of a new coronavirus (2019-nCoV) responsible for atypical pneumonia began to be registered in Wuhan (China). As of April 30, 2020, more than three million people have been infected and there have been almost 230,000 deaths in 180 countries throughout the world. On March 11, 2020, the disease was declared a pandemic by the World Health Organization.

There is currently no vaccine against this disease, and social distancing has been the World Health Organization's main recommendation to prevent its spread. Recently, a study (written in Spanish) used differential equations to simulate the transmission dynamics of the disease from the reported cases of infection in four different countries, according to data recorded at Johns Hopkins University ¹. That paper indicates that the success of the model depends on the quality of the data.

For this reason, it is necessary to validate the reported data on Covid-19 infections, so that we can indicate whether the data have been altered, manipulated or even poorly transcribed for unknown reasons. Benford's Law has been used in various scenarios to detect, for example, fraud in campaign finances ², in governmental economic data ³, in accounting data ⁴ and in scientific data ⁵, among others ⁶,⁷.

In the scientific literature, we found only one paper, published in a repository (arXiv), in which the author used Benford's Law to study the first outbreaks that occurred in China up to February 13, 2020 ⁸. That manuscript concluded that, up to that date, there was no evidence of alteration or manipulation of the cases registered in China.

For this reason, we carry out a more complete study to determine whether the data on people infected with Covid-19 can be validated using Benford's Law, based on the Pearson χ² test and the Mantissa Arc Test and, where necessary, the Freedman-Watson test, to verify that the data have not been manipulated.

Computational Methodology

The data on infected cases were obtained from the Johns Hopkins University database (available at coronavirus.jhu.edu), covering December 31, 2019 to April 30, 2020. The next step was to determine the frequency of appearance of the first digit and compare it with Benford's Law. To do that, we used an algorithm written in R with the benford.analysis library, according to the following equation:

$$P(i) = \log_{10}\left(1 + \frac{1}{i}\right)$$

where i takes the values from 1 to 9 (see details in ⁹). With this distribution, we calculate the Pearson χ² value, a goodness-of-fit statistic, according to this equation:
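As a minimal sketch of this step, the Python snippet below computes the first-digit proportions expected under Benford's Law, log10(1 + 1/i), and the observed proportions for a list of counts. It is an illustration only, not the R code used in the study; the function names and the sample counts are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected proportion of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value):
    """Leading digit of an integer count (returns 0 only for the value 0)."""
    return int(str(abs(int(value)))[0])


def observed_proportions(values):
    """Observed proportion of each leading digit 1..9; zero counts are skipped."""
    digits = [first_digit(v) for v in values if int(v) != 0]
    counts = Counter(digits)
    n = len(digits)
    return {d: counts.get(d, 0) / n for d in range(1, 10)}


# Hypothetical case counts, for illustration only (not taken from the study).
sample = [1, 4, 12, 27, 59, 130, 210, 480, 990, 1800]
expected = benford_expected()
observed = observed_proportions(sample)
for d in range(1, 10):
    print(d, round(expected[d], 3), round(observed[d], 3))
```

With real data, `sample` would be replaced by the series of reported cases for one country.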

$$\chi^2 = N \sum_{k=1}^{9} \frac{\left(P(k) - b(k)\right)^2}{b(k)}$$

where P(k) and b(k) are the proportions of first digit k obtained from the data and from Benford's Law, respectively, and N is the number of data values. The p-value is the probability of obtaining a χ² at least this large by chance if the data follow Benford's Law, as explained in ⁹; a p-value greater than 0.05 is taken to imply that the numbers have not been altered or manipulated. In addition, the Pearson χ² value should tend to zero.
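The goodness-of-fit step can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's R code: the statistic is computed from digit counts (equivalent to N times the sum over proportions), and the p-value uses the closed-form chi-squared survival function that holds for an even number of degrees of freedom, here 8. The function names are hypothetical.

```python
import math

# Benford proportions b(k) for first digits k = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def pearson_chi2(observed_counts):
    """Pearson chi-squared statistic for nine first-digit counts against Benford."""
    n = sum(observed_counts)
    return sum(
        (obs - n * b) ** 2 / (n * b) for obs, b in zip(observed_counts, BENFORD)
    )


def chi2_sf_even_df(x, df=8):
    """P(X > x) for a chi-squared variable with an even number of degrees of freedom.

    For df = 2m the survival function equals exp(-x/2) * sum_{j<m} (x/2)^j / j!,
    so no external statistics library is needed.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    return math.exp(-half) * sum(
        half ** j / math.factorial(j) for j in range(df // 2)
    )


# A chi-squared value of 3.450 with 8 degrees of freedom.
print(round(chi2_sf_even_df(3.450), 3))  # → 0.903
```

The printed value matches the p-value that Table 1 reports for a χ² of 3.450, which supports reading the table entries as decimals with 8 degrees of freedom.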

In the Mantissa Arc Test, it is necessary to calculate the center of mass of the mantissa values of the data, treating them as points distributed on a unit circle. The center of mass is given by:

$$\bar{x} = \frac{1}{N}\sum_{j=1}^{N} \cos\left(2\pi\,(\log_{10} x_j \bmod 1)\right), \qquad \bar{y} = \frac{1}{N}\sum_{j=1}^{N} \sin\left(2\pi\,(\log_{10} x_j \bmod 1)\right)$$

where x₁, x₂, …, x_N are the data values.

The next step is to determine the squared length of the mean vector, L², which is given as

$$L^2 = \bar{x}^2 + \bar{y}^2$$

And finally, the p-value is simply

$$p = e^{-N L^2}$$
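A minimal Python sketch of the Mantissa Arc Test follows. It assumes the usual formulation of the test: each value is mapped to the point on the unit circle at angle 2π times the fractional part of log10(x), L² is the squared distance of the center of mass from the origin, and the p-value is exp(-N·L²). The function name and the two input series are hypothetical.

```python
import math


def mantissa_arc_test(values):
    """Return (L2, p_value) of the Mantissa Arc Test for the positive values given."""
    # Mantissa = fractional part of log10(x); non-positive values are skipped.
    mantissas = [math.log10(v) % 1.0 for v in values if v > 0]
    n = len(mantissas)
    # Center of mass of the points placed on the unit circle.
    x_bar = sum(math.cos(2 * math.pi * m) for m in mantissas) / n
    y_bar = sum(math.sin(2 * math.pi * m) for m in mantissas) / n
    l2 = x_bar ** 2 + y_bar ** 2
    # Assumed p-value form: small L2 (center near the origin) gives p close to 1.
    return l2, math.exp(-n * l2)


# Illustrative inputs only: a geometric series spanning one decade has
# mantissas spread evenly around the circle; a constant series does not.
uniform_like = [10 ** (k / 100) for k in range(100)]
constant = [2.0] * 50

print(mantissa_arc_test(uniform_like))
print(mantissa_arc_test(constant))
```

The first series has its center of mass at the origin and therefore passes, while the second concentrates all points at one angle and gives a p-value far below 0.05.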

Finally, to verify whether a country really fails Benford's Law, we apply a third test, the Freedman-Watson test ¹⁰. Its equation is too involved to explain here; see ¹⁰ for details.

As before, a p-value greater than 0.05 indicates that the data have not been altered or manipulated.

Finally, the calculations were carried out for twenty-three countries, from December 29, 2019 until April 30, 2020: China, India, Germany, Brazil, Venezuela, Netherlands, Italy, Colombia, Russia, Norway, South Africa, Portugal, Singapore, United Kingdom, Chile, Ecuador, Egypt, Denmark, Ireland, France, Belgium, Australia and Croatia. The results are explained in the next section.

Results

In Table 1, we summarize the results obtained with the two tests, using data up to April 30, 2020. The countries were grouped at random into three blocks. The number of degrees of freedom is 8 for the Pearson χ² test and 2 for the Mantissa Arc Test. In addition, we indicate the number of data points for each country (the results were verified with another R module, BenfordTests).

Table 1. Results obtained according to Benford’s law (see text for more details).

Country             China       Italy       Brazil      Colombia    Venezuela   India       Russia
χ²                  3.450       33.383      6.785       16.974      8.557       12.560      22.709
Sample size         109         71          58          52          34          62          54
p-value (χ²)        0.903       10^-5       0.560       0.030       0.381       0.128       0.004
p-value (Mantissa)  0.522       10^-6       0.354       0.061       0.868       0.002       0.118

Country             Germany     Norway      S. Africa   Portugal    Singapore   Netherlands UK          Chile
χ²                  12.425      7.952       6.619       16.623      4.373       22.725      55.074      26.363
Sample size         75          63          54          60          91          64          70          58
p-value (χ²)        0.133       0.438       0.578       0.034       0.822       0.003       10^-6       10^-4
p-value (Mantissa)  0.386       0.331       0.372       0.004       0.935       10^-8       10^-6       0.001

Country             Ecuador     Egypt       Denmark     Ireland     France      Belgium     Australia   Croatia
χ²                  9.408       10.194      25.535      9.174       14.025      24.605      5.011       7.868
Sample size         55          54          64          59          72          62          77          62
p-value (χ²)        0.309       0.252       0.001       0.328       0.081       0.002       0.756       0.447
p-value (Mantissa)  0.557       0.142       10^-4       0.167       0.139       0.003       0.445       0.001

The countries that pass both tests, meaning that both p-values are greater than 0.05, are China, Germany, Brazil, Venezuela, Norway, South Africa, Singapore, Ecuador, Egypt, Ireland, France and Australia. We take this to mean that the information from these countries is valid. In fact, China, Singapore and Australia agree very closely with Benford's Law. On the other hand, Colombia, India, Russia and Croatia pass one of the two tests, as shown in Table 1, so we do not consider that these countries manipulated the data.

However, Italy, Portugal, Netherlands, United Kingdom, Denmark, Belgium and Chile do not pass either of the two tests (see Table 1). For these countries, we calculated the p-value of the Freedman-Watson test (using the benford.analysis library), and the results obtained were 10^-3, 10^-16, 10^-4, 10^-16, 10^-10, 10^-16 and 10^-4 for Italy, Portugal, Netherlands, United Kingdom, Denmark, Belgium and Chile, respectively. Therefore, three different tests indicate that these countries may have somehow altered the data, because the accuracy of their data could not be verified with any of these three tests.
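The decision rule applied to Table 1 can be sketched in Python as below. The p-values are copied from Table 1, with the entries printed as powers of ten entered as such; the 0.05 threshold is the one stated in the text, and the function and variable names are hypothetical.

```python
ALPHA = 0.05

# (p-value of Pearson chi-squared, p-value of Mantissa Arc Test) from Table 1.
TABLE_1 = {
    "China": (0.903, 0.522), "Italy": (1e-5, 1e-6), "Brazil": (0.560, 0.354),
    "Colombia": (0.030, 0.061), "Venezuela": (0.381, 0.868),
    "India": (0.128, 0.002), "Russia": (0.004, 0.118),
    "Germany": (0.133, 0.386), "Norway": (0.438, 0.331),
    "South Africa": (0.578, 0.372), "Portugal": (0.034, 0.004),
    "Singapore": (0.822, 0.935), "Netherlands": (0.003, 1e-8),
    "United Kingdom": (1e-6, 1e-6), "Chile": (1e-4, 0.001),
    "Ecuador": (0.309, 0.557), "Egypt": (0.252, 0.142),
    "Denmark": (0.001, 1e-4), "Ireland": (0.328, 0.167),
    "France": (0.081, 0.139), "Belgium": (0.002, 0.003),
    "Australia": (0.756, 0.445), "Croatia": (0.447, 0.001),
}


def classify(p_values, alpha=ALPHA):
    """Group countries by how many of the two tests they pass (p > alpha)."""
    groups = {"pass_both": [], "pass_one": [], "fail_both": []}
    for country, (p_chi2, p_mantissa) in p_values.items():
        passed = (p_chi2 > alpha) + (p_mantissa > alpha)  # 0, 1 or 2
        key = ("fail_both", "pass_one", "pass_both")[passed]
        groups[key].append(country)
    return groups


groups = classify(TABLE_1)
for name, countries in groups.items():
    print(name, len(countries), ", ".join(countries))
```

Only the countries in the `fail_both` group go on to the third test (Freedman-Watson).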

However, it is necessary to wait until the end of the pandemic to analyze all the data and establish whether these countries did manipulate the data, or whether the failures are instead due to the omission of registered cases.

Conclusions

The analysis based on Benford's Law of the Covid-19 infection cases indicates that China, Germany, Brazil, Venezuela, Norway, South Africa, Singapore, Ecuador, Egypt, Ireland, France, Australia, Colombia, India, Russia and Croatia did not manipulate the information registered in the Johns Hopkins dataset. However, Italy, Portugal, Netherlands, United Kingdom, Denmark, Belgium and Chile do not pass the three tests carried out in this paper, and therefore further studies are needed in these countries to establish whether the information was manipulated or altered.

In fact, we consider that we must wait until the end of the pandemic, when all cases have been registered in all countries; only then can we establish whether the data provided by a given country lack credibility.

References

1. Isea R. (2020) La dinámica de transmisión del Covid-19 desde una perspectiva matemática. Revista del Observador del Conocimiento 5(1), 15-23.

2. Cho W, Gaines B. (2007) Breaking the (Benford) Law: statistical fraud detection in campaign finance. The American Statistician 61(3), 218-223.

3. Rauch B, Gottsche M, Engel S. (2011) Fact and Fiction in EU-Governmental Economic Data. German Economic Review 12(3), 243-255.

4. Durtschi C, Hillison W, Pacini C. (2004) The effective use of Benford's Law to assist in detecting fraud in accounting data. J. Forensic Accounting 5, 17-34.

5. Diekmann A. (2007) Not the first digit! Using Benford's Law to detect fraudulent scientific data. J. Appl. Stat. 34(3), 321-329.

6. Formann A K. (2010) The Newcomb-Benford Law in its relation to some common distributions. PLoS ONE 5, e10541.

7. Pietronero L, Tosatti V, Vespignani A. (2001) Explaining the uneven distribution of numbers in nature: the laws of Benford and Zipf. Physica A 293(1-2), 297-304.

8. Zhang J. (2020) Testing case number of coronavirus disease 2019 in China with Newcomb-Benford Law. arXiv repository, ID: 2002.05695.

9. Nigrini M J. (2012) Benford's Law: Applications for Forensic Accounting, Auditing, and Fraud Detection. John Wiley & Sons, Inc., New Jersey.

10. Freedman L S. (1981) Watson's U_N² Statistic for a Discrete Distribution. Biometrika 68, 708-711.

