How we count at OpenVAERS
There are many approaches to analyzing the VAERS data, and there is no one right way. The choice depends partly on the audience one is speaking to and partly on the expertise of the particular researcher. At OpenVAERS, the mission is to make the raw data accessible. Our focus is on user-centered design and programming. Thus, we give you the counts that the CDC provides. Deaths are counted as Died=Yes, and age is retrieved from the AGE_YRS field. (There are actually two age fields, and they don’t always agree, but we use the one that the Wonder site uses.)
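As a rough illustration of counting from the structured fields only, here is a minimal Python sketch. The sample rows are made up, and the column names `DIED` (coded as `Y`) and `CAGE_YR` (standing in for the second age field) are assumptions about the export layout; only `AGE_YRS` is named above.

```python
import csv
import io

# Made-up sample rows shaped like a VAERS export (not real reports).
# DIED="Y" and the CAGE_YR column are assumptions about the file layout.
SAMPLE_CSV = """VAERS_ID,AGE_YRS,CAGE_YR,DIED
1000001,70,70,Y
1000002,34,35,
1000003,,,Y
1000004,12,12,
"""

def count_reports(csv_text):
    """Count reports using only structured fields, never the narrative text."""
    counts = {"total": 0, "deaths": 0, "unknown_age": 0, "age_mismatch": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts["total"] += 1
        # A death is counted only when the DIED field says so.
        if row["DIED"].strip().upper() == "Y":
            counts["deaths"] += 1
        age = row["AGE_YRS"].strip()
        other_age = row["CAGE_YR"].strip()
        # A blank AGE_YRS is reported as unknown rather than guessed.
        if not age:
            counts["unknown_age"] += 1
        # Flag reports where both age fields are filled in but disagree.
        elif other_age and float(age) != float(other_age):
            counts["age_mismatch"] += 1
    return counts

result = count_reports(SAMPLE_CSV)
print(result)
```

On this sample the sketch counts 4 reports, 2 deaths, 1 unknown age, and 1 report where the two age fields disagree.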
The VAERS data is complicated to explain even to an audience made up of scientists because, in order to talk about specifics, one needs to first describe the underlying architecture. One cannot do this in 10 minutes. Our best guess is that it would take about 45 minutes to give a newbie a basic understanding of the VAERS data.
OpenVAERS uses the fields the CDC uses. People like Dr. Jessica Rose also use the CDC fields. In this way our numbers are defensible and comparable. They can match similar queries made on Wonder.
Others will dive into the Symptom_Text field, the History field, and other free-text fields. These fields were more useful before the European data was removed from the export we get weekly from the CDC. The problem is that the CDC does NOT use these fields in its counts.
Using these fields to mine for numbers or reports can be interesting but problematic. To do it on a large scale, search algorithms must be written, and the results are only as good as the algorithm. When you mine the text fields you get some good data for sure, but you also get problems. With age, for example, you may search for “XX years old,” but that isn’t always a reference to the age of the injured party. Sometimes the narrative mentions the age of a relative or of surviving children who were not the subject of the report. The search can also pull up ages that are mentioned only in passing, for example: “A consumer reported that they had a stroke and that they had a 70 year old friend who had one too.” More problems come from the dual age fields, which sometimes don’t match. Which is correct? The CDC clearly thinks the AGE field it uses is the correct one (rather than an age pulled from the Symptom_Text field or the History field).
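The pitfall is easy to reproduce. The following Python sketch uses a hypothetical search pattern (not one that OpenVAERS or the CDC uses) to pull “XX years old” mentions from a narrative. It cannot tell whose age it has found.

```python
import re

# Hypothetical pattern for "70 year old", "45-year-old", "12 years old", etc.
AGE_PATTERN = re.compile(r"\b(\d{1,3})[- ]years?[- ]old\b", re.IGNORECASE)

def ages_mentioned(narrative):
    """Return every age mentioned in the text, whoever it belongs to."""
    return [int(match) for match in AGE_PATTERN.findall(narrative)]

# The example narrative from above: the reporter's own age is never given.
passing_mention = ("A consumer reported that they had a stroke and that they "
                   "had a 70 year old friend who had one too.")

# A made-up narrative that mentions both the patient and a surviving child.
two_ages = ("The patient, a 45-year-old woman, is survived by "
            "her 12 year old son.")

print(ages_mentioned(passing_mention))
print(ages_mentioned(two_ages))
```

The first narrative yields 70, which is the friend's age and not the age of the person who had the stroke. The second yields both 45 and 12, and the pattern alone gives no way to decide which one belongs to the subject of the report.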
A large percentage of “unknown age” reports is a problem. Any blank field is a problem. You can see the effect of the unknowns on some charts throughout our site. The CDC could easily fix these problems through more effective data collection and follow-up. But the CDC has had 30 years to address these problems, and the question is: why hasn’t it?
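To see how blank ages show up in a chart, here is a small Python sketch that groups raw AGE_YRS values into buckets and keeps the blanks as their own “Unknown” bucket. The bucket boundaries and sample values are hypothetical, chosen only for illustration.

```python
from collections import Counter

def age_bucket(age_yrs):
    """Map a raw AGE_YRS string to a chart bucket; blanks become 'Unknown'."""
    text = age_yrs.strip()
    if not text:
        return "Unknown"
    age = float(text)
    # Hypothetical bucket boundaries, for illustration only.
    if age < 18:
        return "0-17"
    if age < 65:
        return "18-64"
    return "65+"

# Made-up AGE_YRS values, including three blanks.
raw_ages = ["70", "34", "", "12", "", "81", "0.5", ""]

buckets = Counter(age_bucket(a) for a in raw_ages)
unknown_share = buckets["Unknown"] / len(raw_ages)

print(dict(buckets))
print(f"Unknown age: {unknown_share:.1%}")
```

In this sample the “Unknown” bucket is the largest one, so any statement about the age distribution comes with that much uncertainty attached.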
When we refer to undercounting of VAERS reports, this issue of incomplete field data factors into the equation. Yes, there are definitely more reports that are positive for any given criterion, but a search will also gather reports (usually an equal number) that are not positive hits. And, without the original files and information that the CDC keeps to itself, it’s often not possible to be certain. Remember, the CDC keeps two sets of books on vaccine injury and the VAERS data we get is the doctored set of books. Obsessing over the exact number of any specific adverse event in a set of inexact data is a little like trying to count the stars in the sky. There are just a lot. Given that, the best we can do is make broad generalizations based on comparisons with what happened before and say there is a huge problem here.