About Our Data

Published by: The Arkivist
Published on: April 08, 2021

In my experience, few doctors know VAERS exists, and those who do often think it is only for "rare" anaphylactic reactions. Most people do not know about it either. VAERS is getting much more attention during COVID because there is no other public, functioning government tracking system. You can read more about VAERS here: vaers.hhs.gov/about.html

I tried to search the VAERS data and found the interface frustrating and dinosaur-like. If I, as a developer, found it difficult to use, how could anyone else hope to get data from it without spending a day learning the ins and outs? I turned my attention to asking what people most wanted out of VAERS and how best to deliver it (something I do regularly in my job). I decided that what many want is to read the story that accompanies each record without having to build complicated queries. They want answers to relatively simple questions: How many died? How many had what my kid had? I set out to build a friendly interface for people to do that.

We build our data from the readily available data dumps that VAERS provides. These dumps used to be updated monthly, but since COVID VAERS has provided a weekly update. Each year has to be downloaded separately, and each year's download contains 3 CSV files. These are relational tables that have to be hooked back together to build a single data file for the web. So 3 files x 31 years = 93 files, plus a single foreign records table containing 90,000+ records from all 31 years in one. Before COVID, we decided our updates would happen yearly, as there was no way to do this every month. With the COVID changes, people (including us) want up-to-date data. So we now update just 2020 and 2021 weekly and will continue to rebuild the entire file set once per year.
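As a rough illustration of what "hooking the files back together" can involve, the Python sketch below joins three tables on a shared record ID. It is not the code that runs this site. The column names (VAERS_ID, SYMPTOM_TEXT, VAX_TYPE, SYMPTOM1), the function names, and the sample rows are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def read_rows(handle):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def join_year(data_rows, vax_rows, symptom_rows, key="VAERS_ID"):
    """Attach each report's vaccine and symptom rows to its main data row."""
    vax_by_id = defaultdict(list)
    for row in vax_rows:
        vax_by_id[row[key]].append(row)
    symptoms_by_id = defaultdict(list)
    for row in symptom_rows:
        symptoms_by_id[row[key]].append(row)

    merged = []
    for row in data_rows:
        record = dict(row)
        # A report may list several vaccines or symptom rows, or none at all.
        record["vaccines"] = vax_by_id.get(row[key], [])
        record["symptoms"] = symptoms_by_id.get(row[key], [])
        merged.append(record)
    return merged


# Tiny in-memory stand-ins for one year's three files (made-up rows).
data_csv = io.StringIO("VAERS_ID,SYMPTOM_TEXT\n1,Sore arm\n2,Hives and itchy throat\n")
vax_csv = io.StringIO("VAERS_ID,VAX_TYPE\n1,FLU\n2,COVID19\n2,FLU\n")
symptom_csv = io.StringIO("VAERS_ID,SYMPTOM1\n1,Pain in extremity\n2,Urticaria\n")

records = join_year(read_rows(data_csv), read_rows(vax_csv), read_rows(symptom_csv))
for record in records:
    print(record["VAERS_ID"], len(record["vaccines"]), len(record["symptoms"]))
```

With real downloads, each io.StringIO stand-in would be replaced by a file opened with open(path, newline=""). The correct text encoding for the files is something to check rather than assume.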

VAERS data is difficult to work with. For example, VAERS has 6 date fields to search by, but only one was a required field when the record was created. This means that if you search deaths by year using any other date field (for example, Date of Death), you will get an undercount of total deaths. But if you use the one field every record has (RECVDATE), you cannot consistently chart events to the date they happened, because a report can be received ANYTIME after the injury. So if you are looking at how deaths progress by year in relation to a historical event (say, the introduction of a vaccine), your data will be faulty. This, I imagine, is part of why VAERS is unreliable for precise data analysis.
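The trade-off between date fields can be seen in a small Python sketch. It assumes a death flag column DIED with the value "Y", a date-of-death column DATEDIED, a received-date column RECVDATE, and dates written as MM/DD/YYYY. These are assumptions about the CSV layout, and the rows are made up.

```python
from collections import Counter
from datetime import datetime


def year_of(value, fmt="%m/%d/%Y"):
    """Return the year of a date string, or None if it is blank or malformed."""
    try:
        return datetime.strptime(value.strip(), fmt).year
    except (ValueError, AttributeError):
        return None


def deaths_by_year(rows, date_field):
    """Count death reports per year using one date field; blanks land under None."""
    counts = Counter()
    for row in rows:
        if row.get("DIED") == "Y":
            counts[year_of(row.get(date_field, ""))] += 1
    return counts


# Made-up rows: the second death report has no date of death recorded.
rows = [
    {"VAERS_ID": "1", "DIED": "Y", "DATEDIED": "12/30/2020", "RECVDATE": "01/05/2021"},
    {"VAERS_ID": "2", "DIED": "Y", "DATEDIED": "", "RECVDATE": "01/12/2021"},
    {"VAERS_ID": "3", "DIED": "", "DATEDIED": "", "RECVDATE": "01/12/2021"},
]

by_death_date = deaths_by_year(rows, "DATEDIED")
by_received_date = deaths_by_year(rows, "RECVDATE")
print("by date of death:", dict(by_death_date))
print("by received date:", dict(by_received_date))
```

Grouping by date of death leaves one of the two deaths without a year, while grouping by received date keeps both but places the December 2020 death in 2021. Neither view is complete on its own.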

Another issue is naming conventions and search terms. Do you want Bell's Palsy? Well, you need to search "Facial Paralysis" in the Symptoms field. But what about numbness, tingling, paresthesia? Those could show up in the description rather than the symptom list, because someone with symptoms of Bell's Palsy may not have had it entered as an "official" symptom. The same is true of Anaphylaxis. If you search Anaphylaxis you get a result. However, if you also search for throat itchiness and hives, the number of likely severe allergy cases increases substantially.
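One way to picture the gap between the coded symptom list and the written description is the Python sketch below. It checks the coded terms first and then falls back to the narrative. The record shape (a "symptoms" list of coded terms plus a SYMPTOM_TEXT narrative), the function names, and the sample reports are hypothetical.

```python
def matches_any(text, terms):
    """Case-insensitive check for whether any search term occurs in the text."""
    lowered = (text or "").lower()
    return any(term.lower() in lowered for term in terms)


def search_reports(records, coded_terms, narrative_terms):
    """Split matching report IDs into coded-symptom hits and narrative-only hits."""
    coded_hits, narrative_only_hits = [], []
    for record in records:
        coded_text = " | ".join(record.get("symptoms", []))
        if matches_any(coded_text, coded_terms):
            coded_hits.append(record["VAERS_ID"])
        elif matches_any(record.get("SYMPTOM_TEXT", ""), narrative_terms):
            # Found only in the written description, not in the coded list.
            narrative_only_hits.append(record["VAERS_ID"])
    return coded_hits, narrative_only_hits


# Made-up reports: only the first carries the coded term.
records = [
    {"VAERS_ID": "1", "symptoms": ["Facial paralysis"],
     "SYMPTOM_TEXT": "Left side of face drooping."},
    {"VAERS_ID": "2", "symptoms": ["Hypoaesthesia"],
     "SYMPTOM_TEXT": "Numbness and tingling on one side of the face."},
    {"VAERS_ID": "3", "symptoms": ["Pyrexia"],
     "SYMPTOM_TEXT": "Fever for two days."},
]

coded, narrative_only = search_reports(
    records,
    coded_terms=["facial paralysis"],
    narrative_terms=["numbness", "tingling", "paresthesia", "paraesthesia"],
)
print("coded:", coded)
print("narrative only:", narrative_only)
```

Narrative-only matches are candidates to read, not confirmed cases. A plain substring search will also pick up reports that mention a term in passing.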

VAERS managers can also edit any record at any time and can delete records. In the last month, we discovered that approximately 50 records appeared in the COVID data and then disappeared. Looking briefly at the missing records, I am at a loss to explain why most were removed. There are a few that I could see were not strong, but overall they look like all the other reports. I can think of quite a few reasons records might disappear from the dump (I suspect they are still in the VAERS database but unpublished), some nefarious, some more to do with data issues and duplicates. I don't have the ability to figure this out without access to VAERS. We decided to keep our data synced with VAERS so that there would be no question that our data is real. We backed up the older database with the excess records and will continue to make a backup with each weekly iteration, but we do not have the resources to track these changes.
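For readers who keep their own copies of the weekly dumps, a comparison between two of them could in principle look like the Python sketch below. It is not something this site runs. The key column VAERS_ID, the function name, and the sample rows are assumptions for illustration.

```python
def diff_dumps(previous_rows, current_rows, key="VAERS_ID"):
    """Compare two dumps and list record IDs that were removed, added, or edited."""
    previous = {row[key]: row for row in previous_rows}
    current = {row[key]: row for row in current_rows}
    # IDs are compared and sorted as strings, exactly as read from the CSV.
    removed = sorted(previous.keys() - current.keys())
    added = sorted(current.keys() - previous.keys())
    edited = sorted(
        record_id
        for record_id in previous.keys() & current.keys()
        if previous[record_id] != current[record_id]
    )
    return {"removed": removed, "added": added, "edited": edited}


# Made-up rows standing in for last week's and this week's dump.
last_week = [
    {"VAERS_ID": "100", "SYMPTOM_TEXT": "Headache"},
    {"VAERS_ID": "101", "SYMPTOM_TEXT": "Rash"},
    {"VAERS_ID": "102", "SYMPTOM_TEXT": "Fever"},
]
this_week = [
    {"VAERS_ID": "100", "SYMPTOM_TEXT": "Headache"},
    {"VAERS_ID": "102", "SYMPTOM_TEXT": "Fever and chills"},
    {"VAERS_ID": "103", "SYMPTOM_TEXT": "Dizziness"},
]

changes = diff_dumps(last_week, this_week)
print(changes)
```

A comparison like this only shows that a record left the published dump or changed. It cannot say why.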

OpenVaers is a work in progress, so from time to time, as happened today, the site will turn off for maintenance and then restart. We hope to keep this to a minimum.


Questions? Comments? Bugs?

But PLEASE read the FAQ first.