Systematic Pharmacovigilance Signal Analysis of Migraine Preventives in Adolescent Females
December 2025 - Current
Adolescent migraine prophylaxis operates on an evidence base that was not built for adolescents. Most preventive therapies were evaluated in adult trials with median participant ages well above the teenage years, and those studies establish broad efficacy and safety profiles. But they weren’t designed to capture how adverse effects show up in younger patients, especially those with early-onset or treatment-resistant conditions.
The limitation becomes obvious when treatment is iterative. Patients who fail initial therapies or stop because of intolerance often go through multiple medication transitions. In these cases, adverse effects are not merely “side effects”; they are frequently the deciding factor in whether a therapy is viable at all. Despite this, existing pharmacovigilance frameworks do not systematically model how those effects differ by age, sex, or genetic background.
SigVigil is the third component of the Migraine Stratification Outcomes Framework (MiSOF). ChanVar structures variable representations and TraitStrata defines phenotypic subgroups. SigVigil evaluates how migraine preventives interact with these subgroups at the level of adverse event signals. NeuroTrack will then extend this into longitudinal treatment trajectory modeling.
The analysis targets five major drug classes: antiepileptics (topiramate, valproate), tricyclic antidepressants (amitriptyline), beta blockers (propranolol), angiotensin receptor blockers (candesartan), and CGRP monoclonal antibodies (erenumab, fremanezumab, galcanezumab). The central question is not simply which adverse effects are associated with these drugs, but how those associations shift when examined within specific demographic groups.
Cheers,
Angie X.
*Note: this project is not fully polished yet. Sorry if there are discrepancies or content gaps; I'm working on it!
Clinical trials tend to prioritize internal validity. Participants are selected to reduce variability, comorbidities are controlled, and observation periods are bounded. These constraints are generally necessary on methodological grounds, but they limit how well trial results generalize. People who fall outside those constraints still receive the same medications, often guided by evidence derived from groups that differ substantially from them.
Post-market pharmacovigilance systems address this gap. FAERS, the FDA Adverse Event Reporting System, aggregates reports from patients, clinicians, and manufacturers after drugs enter widespread use. The public database now contains tens of millions of reports spanning several decades. It suffers from underreporting and reporting bias, but it provides one of the few large-scale, publicly accessible views of how drugs behave outside controlled environments.
SigVigil implements 4 disproportionality methods from first principles: Reporting Odds Ratio (ROR), Proportional Reporting Ratio (PRR), Information Component (IC), and Empirical Bayes Geometric Mean (EBGM). Each of these methods captures a different aspect of signal strength and uncertainty. Combined, they give a far more comprehensive view than any single metric alone. The SigVigil pipeline also includes stratified analyses by age and sex, sensitivity tests, and multiple testing correction.
I started with a straightforward literature search regarding the adverse effects of migraine preventives specific to adolescent females. However, after a while of searching, the pattern was pretty clear. Almost nothing. And not because the adverse effects don’t exist, but because the studies that would characterize them systematically haven’t even been done.
The pharmacoepidemiology literature on topiramate, valproate, and the CGRP antibodies focuses on adults. The handful of pediatric studies that do exist are single-arm observational studies with small samples and no comparative framework. I found no paper that systematically analyzes FAERS for the adolescent female subgroup across multiple drug classes.
The reason for the gap is not complicated. Drug trials are horribly expensive. Pediatric trials are even more expensive, slower to enroll, and face additional regulatory and ethical hurdles. CGRP antibodies were only approved for migraine at all in 2018. The real-world post-market data that would characterize their behavior in adolescent females simply has not accumulated for long enough to produce a stable signal in most analyses.
There is also a subtler issue: FAERS does not actively recruit reporters. It collects reports from whoever chooses to submit them. Adolescents are less likely than adults to self-report adverse effects. Parents may report on their behalf, but this occurs less consistently. Additionally, physicians in adolescent practices may have different reporting rates than adult care physicians.
I am not saying that SigVigil corrects for that bias. Unfortunately, it cannot. What it can do is make the existing signal, however attenuated by underreporting, much more visible and analyzable in an open framework that any researcher can extend or critique.
Building SigVigil required learning the ICH guidelines, the MedDRA hierarchy, the mechanics of spontaneous reporting systems, and the statistical literature on disproportionality analysis, all in parallel with writing the actual code.
Here are a few things I found interesting.
The first was that the FDA has been using Bayesian methods for drug safety signal detection since 1999. The DuMouchel MGPS paper, the basis for SigVigil's EBGM implementation, is more than twenty-five years old. This surprised me since the current discourse around AI/medicine tends to present Bayesian or probabilistic reasoning as somehow novel or cutting edge. Meanwhile, a government agency was implementing two-component Gamma mixture priors and computing posterior geometric means over millions of drug-event pairs a quarter century ago, without any of the infrastructure that makes that kind of computation easy today. Pretty nifty.
The second was the FAERS data engineering problem. The raw data is 25 million records across quarterly ASCII files going back to the late 1990s. Drug names are free text. The same drug appears as 'TOPIRAMATE', 'Topamax', 'topiramate 100MG TABLET', 'TOPIRAMATE 25 MG ORAL TABLET', 'Topi.', and several hundred other variants. Pre-2012 files use a different primary key than post-2012 files, and some quarterly files have encoding issues. The deduplication problem, where the same adverse event may be reported multiple times by different parties, requires heuristic approaches because FAERS does not provide a ground truth for what constitutes a unique case. I spent more time on data cleaning than on any single statistical method. This was an issue that would later arise in NeuroTrack as well, and I similarly documented it in Phase 1.
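To make the drug name problem concrete, here is a minimal sketch of one way to collapse those variants. It is not SigVigil's actual normalizer: the alias table, the canonical list, the regular expressions, and the function name normalize_drug_name are all hypothetical, and it uses the standard library's difflib as a dependency-free stand-in for a dedicated fuzzy matcher.

```python
import re
from difflib import get_close_matches

# Hypothetical alias table: brand names and abbreviations -> canonical ingredient.
ALIASES = {"TOPAMAX": "TOPIRAMATE", "TOPI": "TOPIRAMATE", "DEPAKOTE": "VALPROATE"}
# Hypothetical list of canonical ingredient names to match against.
CANONICAL = ["TOPIRAMATE", "VALPROATE", "AMITRIPTYLINE", "PROPRANOLOL"]

# Strengths such as "100MG" or "25 MG", and a few common dosage-form words.
_DOSE = re.compile(r"\b\d+(\.\d+)?\s*(MG|MCG|G|ML)\b")
_FORM = re.compile(r"\b(ORAL|TABLET|TABLETS|CAPSULE|CAPSULES|INJECTION)\b")


def normalize_drug_name(raw, cutoff=0.85):
    """Map a free-text drug string to a canonical name, or None if no match."""
    name = raw.upper()
    name = _DOSE.sub(" ", name)            # drop strengths
    name = _FORM.sub(" ", name)            # drop dosage-form words
    name = re.sub(r"[^A-Z ]", " ", name)   # drop punctuation and stray digits
    name = " ".join(name.split())          # collapse whitespace
    if name in ALIASES:                    # exact alias hit (brand or abbreviation)
        return ALIASES[name]
    # Fuzzy fallback for misspellings; cutoff is a similarity ratio in [0, 1].
    match = get_close_matches(name, CANONICAL, n=1, cutoff=cutoff)
    return match[0] if match else None


for variant in ["TOPIRAMATE", "Topamax", "topiramate 100MG TABLET",
                "TOPIRAMATE 25 MG ORAL TABLET", "Topi."]:
    print(variant, "->", normalize_drug_name(variant))
```

The order matters in this sketch: cleaning first keeps the alias table small, and the fuzzy step only runs on strings that survive as unknown. Returning None rather than a best guess leaves unmatched names available for manual review.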
The third was the distinction between a signal and a cause. Disproportionality analysis detects overrepresentation in a database; it does not detect causation. An elevated ROR for (valproate, amenorrhoea) means that FAERS reports mentioning valproate are disproportionately likely to also mention amenorrhoea. Statistically speaking, this could mean valproate causes amenorrhoea, but it could equally mean that physicians prescribing valproate are more attentive to reproductive adverse effects than physicians prescribing other drugs, and therefore more likely to report them. It could also mean that the population on valproate has underlying characteristics that predispose them to menstrual irregularity independent of the drug. The signal is real, yet the causal interpretation requires clinical judgment and, where warranted, formal epidemiological follow-up.
I found this distinction hard to hold onto consistently while writing analysis code. This has always been a personal struggle of mine: I like to reduce things to precise numbers, code, and equations and then assume they extrapolate perfectly to reality. I had to keep reminding myself that, however precise the computations, the causal questions remained. Every number in SigVigil is a ratio of frequencies in a biased, voluntary-reporting database, and not forgetting this is an important scientific virtue.
The SigVigil library has 4 statistical modules, each implementing one method from scratch.
The ROR is structurally the simplest: it is just a ratio of odds from a 2x2 contingency table, with a log-normal confidence interval and a Haldane-Anscombe correction for zero cells. This took only about an afternoon, including the unit tests.
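For readers who want to see the arithmetic, here is a minimal sketch of that calculation. The function name and the example counts are hypothetical, and applying the 0.5 correction only when some cell is zero is one common convention that I am assuming here, not necessarily the one in the SigVigil module.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """ROR and log-normal confidence interval from a 2x2 table.

    a: reports with the target drug and the target event
    b: reports with the target drug and any other event
    c: reports with other drugs and the target event
    d: reports with other drugs and any other event
    """
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is zero
    # (assumed convention; some implementations add it unconditionally).
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    ror = (a * d) / (b * c)
    # Standard error of log(ROR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper


# Hypothetical counts, for illustration only.
ror, lower, upper = reporting_odds_ratio(20, 80, 100, 800)
print(f"ROR = {ror:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

A conventional signal criterion is that the lower confidence bound exceeds 1, which is why the interval matters as much as the point estimate.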
The IC took longer. The variance approximation, the piece that allows you to compute the lower credible bound IC025, is in a 2006 paper that itself cites a 1998 paper. The relationship between them is not immediately obvious. The key insight here is that the IC variance is a function of both n_de and N (total cases), not just n_de. For very large N, even a small n_de gives a relatively tight IC025 because the expected count E_de is well estimated. For small N, uncertainty about E_de propagates into uncertainty about IC. Working through this from first principles took longer than I expected, so the unit tests for the IC module were the most careful I wrote.
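As a point of comparison, here is a sketch of a simpler, commonly used shrinkage form of the IC. To be clear, this is not the variance approximation described above: it adds 0.5 to the observed and expected counts and reads the credible bounds off Gamma quantiles, treating the expected count as known. The function name and the example counts are hypothetical.

```python
import math

from scipy.stats import gamma


def information_component(n_de, n_d, n_e, n_total):
    """Shrinkage IC with a 95% credible interval (simplified form).

    n_de: reports with both the drug and the event
    n_d: reports with the drug; n_e: reports with the event
    n_total: all reports in the analysis set
    """
    expected = n_d * n_e / n_total   # E_de under independence
    shape = n_de + 0.5               # observed count plus shrinkage
    rate = expected + 0.5            # expected count plus shrinkage
    ic = math.log2(shape / rate)
    # Credible bounds from the Gamma(shape, rate) quantiles of the
    # observed-to-expected ratio; SciPy takes scale = 1 / rate.
    ic025 = math.log2(gamma.ppf(0.025, shape, scale=1 / rate))
    ic975 = math.log2(gamma.ppf(0.975, shape, scale=1 / rate))
    return ic, ic025, ic975


# Hypothetical counts, for illustration only.
ic, ic025, ic975 = information_component(25, 500, 300, 15000)
print(f"IC = {ic:.2f}, IC025 = {ic025:.2f}, IC975 = {ic975:.2f}")
```

Because this version treats E_de as fixed, it does not reproduce the dependence on N discussed above; that dependence is exactly what the fuller variance approximation adds.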
The EBGM was the hardest. The DuMouchel 1999 algorithm requires fitting a five-parameter model (two Gamma shapes, two Gamma rates, and a mixing weight) by maximum likelihood across all drug-event pairs in the database. Not for the specific pair you are analyzing, but for the entire background distribution. This is the 'empirical' part of empirical Bayes. The prior is learned from the data; the posterior for each specific pair is then computed analytically using the gamma-Poisson conjugate structure; the MLE fitting uses L-BFGS-B via scipy.optimize.minimize; and the posterior computation uses the digamma function for the geometric mean and Brent's method for the EB05 (5th percentile). Holy run-on sentence.
The piece I was most uncertain about was the EB05 computation. The DuMouchel paper gives the EBGM as the geometric mean of the posterior, but the EB05 requires inverting the CDF of a mixture of two Gamma distributions. There is no closed form for this, so you have to solve it numerically. I decided to use Brent's method with the mixture CDF, bounded between 1e-6 and 10 times EBGM. This actually works pretty well for the range of values in FAERS. The unit tests include a check that EB05 is less than EBGM for all tested inputs. For posteriors of the kind that arise here, the 5th percentile should always sit below the geometric mean, so this is a useful sanity check that the numerical CDF inversion is working correctly.
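The posterior step can be sketched in a few lines once the prior is in hand. This is a minimal illustration that assumes the five hyperparameters have already been fitted; it does not include the maximum likelihood fit. The function names are hypothetical, and the hyperparameter values in the example are illustrative placeholders, not values fitted to FAERS.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma, gammaln
from scipy.stats import gamma


def _log_marginal(n, e, a, b):
    """Log negative-binomial marginal of n given a Gamma(a, rate=b) prior."""
    return (gammaln(a + n) - gammaln(a) - gammaln(n + 1)
            + a * np.log(b / (b + e)) + n * np.log(e / (b + e)))


def ebgm_posterior(n, e, a1, b1, a2, b2, p):
    """EBGM, EB05 and posterior weight of component 1 for one drug-event pair.

    n: observed count; e: expected count
    a1, b1, a2, b2: shapes and rates of the two Gamma prior components
    p: prior mixing weight of component 1
    """
    # Posterior mixing weight q, computed on the log scale for stability.
    l1 = np.log(p) + _log_marginal(n, e, a1, b1)
    l2 = np.log1p(-p) + _log_marginal(n, e, a2, b2)
    q = 1.0 / (1.0 + np.exp(l2 - l1))
    # Conjugate update: each component becomes Gamma(a + n, rate = b + e).
    # E[log lambda] of a Gamma(shape, rate) is digamma(shape) - log(rate).
    e_log = (q * (digamma(a1 + n) - np.log(b1 + e))
             + (1 - q) * (digamma(a2 + n) - np.log(b2 + e)))
    ebgm = np.exp(e_log)

    def cdf_minus_target(x):
        # Mixture CDF minus 0.05; its root is the 5th percentile.
        return (q * gamma.cdf(x, a1 + n, scale=1 / (b1 + e))
                + (1 - q) * gamma.cdf(x, a2 + n, scale=1 / (b2 + e)) - 0.05)

    eb05 = brentq(cdf_minus_target, 1e-6, 10 * ebgm)
    return ebgm, eb05, q


# Illustrative hyperparameters and counts only (not fitted values).
ebgm, eb05, q = ebgm_posterior(25, 5.0, 0.2, 0.1, 2.0, 4.0, 1 / 3)
print(f"EBGM = {ebgm:.2f}, EB05 = {eb05:.2f}, q = {q:.4f}")
```

Working on the log scale for the mixing weight avoids underflow when the two marginal likelihoods differ by many orders of magnitude, which happens often for pairs with large counts.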
The multiple testing module implements 3 corrections. Bonferroni is trivial, while Benjamini-Hochberg requires getting the running minimum right. The q value for the ith ranked p value is the minimum over all j ≥ i of (K/j) × p(j), where K is the number of tests, not just the ith term. Getting the direction of this inequality wrong, a common error, produces q values that are not monotone in the p values, which breaks the BH procedure. The Storey q value adds pi0 estimation via a cubic spline fit across a grid of lambda values. This introduces an additional numerical subtlety. The spline extrapolated to lambda = 1 should be clipped to the minimum observed pi0 estimate rather than allowed to extrapolate freely. I also verified all 3 implementations against known examples from the original papers.
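The running minimum is easiest to see in code. Here is a minimal pure-Python sketch of the Benjamini-Hochberg q values; the function name and the example p values are hypothetical.

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg q values, returned in the original input order."""
    k = len(pvals)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(k), key=lambda i: pvals[i])
    q = [0.0] * k
    running_min = 1.0
    # Walk from the largest p value down, so that each q value is the
    # minimum of (K / j) * p(j) over all ranks j >= i. Starting at 1.0
    # also caps the q values at 1.
    for rank in range(k, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * k / rank)
        q[idx] = running_min
    return q


# Hypothetical p values chosen so that the running minimum matters:
# rank 2 alone would give 0.03 * 4 / 2 = 0.06, but rank 3 gives
# 0.04 * 4 / 3 = 0.0533, so both ranks share the smaller value.
print(bh_qvalues([0.001, 0.04, 0.03, 0.5]))
```

Walking in the other direction, from the smallest p value up, is the error mentioned above: it takes the minimum over j ≤ i and can assign a larger q value to a smaller p value.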
The data engineering (the downloader, parser, deduplicator, and drug name normalizer) is more lines of code than all the statistical modules combined. RapidFuzz handles fuzzy drug name matching at a configurable threshold, and the deduplicator applies the FDA-recommended version deduplication (keep the latest case version per case ID) followed by a 30-day heuristic pass for likely duplicates. None of this is glamorous work, but it is the difference between an analysis that runs on the actual data and one that runs on a toy dataset.
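The version deduplication step is simple enough to sketch. This assumes each report is a dict carrying FAERS-style caseid and caseversion fields; the function name and the sample records are hypothetical, and the 30-day heuristic pass is not shown because its matching rules are not spelled out here.

```python
def keep_latest_versions(reports):
    """Keep one report per caseid: the one with the highest caseversion."""
    latest = {}
    for report in reports:
        case_id = report["caseid"]
        current = latest.get(case_id)
        # Compare versions as integers; raw files store them as text.
        if current is None or int(report["caseversion"]) > int(current["caseversion"]):
            latest[case_id] = report
    return list(latest.values())


# Hypothetical records, for illustration only.
sample = [
    {"caseid": "1001", "caseversion": "1", "drug": "TOPIRAMATE"},
    {"caseid": "1001", "caseversion": "3", "drug": "TOPIRAMATE"},
    {"caseid": "1001", "caseversion": "2", "drug": "TOPIRAMATE"},
    {"caseid": "2002", "caseversion": "1", "drug": "VALPROATE"},
]
for report in keep_latest_versions(sample):
    print(report["caseid"], report["caseversion"])
```

Casting the version to an integer matters: compared as strings, version "10" would sort below version "9".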
The full SigVigil analysis ran on 15,138 deduplicated cases: adolescent females aged 10–21 with migraine-related drug exposure in FAERS 2004-2024. Of 144 drug-event pairs tested, 27 were flagged by at least one method and 9 by three or more methods simultaneously. This is a meaningful result set.
Pre-specified Hypotheses:
Topiramate weight decreased (n=68, ROR 3.65, IC025 +1.08, EBGM 2.63) and topiramate decreased appetite (n=51, ROR 2.06, IC025 +0.44) were both confirmed by all four methods with p ≈ 0. The anorexigenic signal for topiramate in adolescent females is unambiguous and survives every sensitivity restriction applied. Valproate alopecia and amitriptyline sleep disorder did not clear signal thresholds. This is not evidence of absence; the n for those pairs (26 and 6, respectively) is insufficient for IC025 to cross zero given the background rate. The hypotheses remain scientifically valid.
Primary Findings:
The two strongest signals in the entire dataset are topiramate cognitive disorder (n=25, ROR 7.88, IC025 +1.44, EBGM 3.77) and topiramate memory impairment (n=34, ROR 4.33, IC025 +1.09, EBGM 2.86). Both clear all four methods. EBGM EB05 ≥ 2 (the most conservative criterion) holds for both, meaning even the lower fifth percentile of the Bayesian posterior exceeds the signal threshold. These are the highest-confidence findings in the analysis. Topiramate's cognitive and memory adverse effects in adolescent females are over-represented at a scale that exceeds the weight loss signal in absolute magnitude, despite receiving less clinical attention.
Amitriptyline depression (n=44, ROR 2.42) and amitriptyline anxiety (n=49, ROR 2.42) both clear three methods with EBGM approximately 2.08. Propranolol paraesthesia, depression, and suicidal ideation all clear three methods. Valproate cognitive disorder (ROR 2.65) and valproate menstrual disorder (ROR 3.62) are three-method signals. Erenumab alopecia (n=13, ROR 2.18) is a three-method signal and represents a relatively novel finding for the CGRP class.
Sensitivity Analysis:
The topiramate cognitive and memory signals hold across all five sensitivity variants with zero degradation: serious-only, HCP-only, post-2015, primary-suspect, and ex-top-3-countries. Signal concordance across all five restrictions is about the strongest form of internal validation available from spontaneous reporting data. Signals that collapse under data restrictions are likely artifacts; signals that don't are more likely telling you something about the drug.
Stratified Comparison (Figure 3):
Valproate menstrual disorder shows the largest amplification delta (Δ = +0.90): the signal is substantially stronger in adolescent females than in the general FAERS population on valproate. Topiramate cognitive disorder (Δ = +0.37) and propranolol paraesthesia (Δ = +0.25) are also amplified. Most CGRP antibody pairs show negative deltas: their signals are actually attenuated in adolescent females relative to the general population, consistent with the pre-specified hypothesis about CGRP mAbs carrying lower adverse event burden than antiepileptics.
CGRP Antibodies:
Erenumab somnolence has ROR 0.03, far below 1, meaning erenumab is under-represented for somnolence relative to the background. Across metabolic, neurological, and psychiatric categories, erenumab, fremanezumab, and galcanezumab show a dramatically quieter adverse event profile than topiramate and valproate. This is not a null result; it is the clearest comparative finding in the analysis, and it is clinically actionable: for an adolescent female where cognitive or weight-related adverse effects are a concern, the pharmacovigilance signal landscape supports preferring a CGRP antibody over topiramate when clinical criteria are met.
EBGM vs. ROR (Figure 5):
The topiramate scatter shows the expected pattern. Cognitive disorder sits at ROR 7.88 with EBGM 3.77, the furthest below the identity line of the three main signals, because at n=25 the prior still pulls a large raw ratio toward the background. Memory impairment (ROR 4.33, EBGM 2.86) and weight decreased (ROR 3.65, EBGM 2.63) sit closer to the identity line, reflecting more modest Bayesian shrinkage at higher n. Cognitive disorder and memory impairment are the only points whose EB05 clears 2. The prior is doing what it should: leaving well-supported signals largely intact and tempering confidence in low-n ones.
Overall: The SigVigil analysis produced consistent, cross-checked findings from 20 years of FAERS data. The strongest result, topiramate's cognitive and memory burden in adolescent females, is more pronounced than the weight loss signal the field has focused on clinically. Valproate menstrual disorder shows population-specific amplification. The CGRP antibodies show systematically lower signal burden across the board. These are not artifacts of a small dataset or a single statistical method. They are consistent findings across four disproportionality measures, five sensitivity analyses, and two decades of reporting.
FAERS has been public since 2012 and the methods to analyze it are in papers published between 1995 and 2003. The data necessary to reproduce every result in this project costs nothing and yet, as of 2026, I have not found an open source Python library that implements the full disproportionality analysis suite for spontaneous reporting data with the methodological completeness of SigVigil.
There are OpenFDA API wrappers that let you query aggregate counts, commercial pharmacovigilance platforms used by pharmaceutical companies and regulatory agencies, and R packages with partial implementations. None of them combines public FAERS preprocessing, all four disproportionality methods, the Storey q value, stratified population comparison, a sensitivity analysis suite, and a structured output format suitable for inclusion in a research paper.
As I always say, my SigVigil framework isn’t some production-grade regulatory software. Perhaps you’re sick of my spiel, but I have to continuously reiterate this lest someone forget and attempt to sue me: I’m just a young lass. Aka a high school student with no wet lab/institution access.
The deeper observation is about what open access to methodology enables. The clinical question, whether migraine preventives have different adverse effect profiles in adolescent females than in the general population, will not be answered by a single paper or a single database query. More studies are necessary, and SigVigil is (I hope) a slice of the infrastructure that makes that research program more accessible to researchers.
The gap between clinical evidence and real world experience is a fixture of how medical knowledge is generated. Pharmacovigilance provides a complementary perspective by capturing variability, longer time horizons, and populations that fall outside trial design.
Per ush, SigVigil doesn’t resolve the limitations of this data. It’s just a structured way to examine otherwise difficult-to-access patterns.
Within the MiSOF, SigVigil serves as the intermediate layer between phenotypic stratification and longitudinal modeling. The signals identified here inform the trajectory analyses developed in NeuroTrack, where adverse effects are examined in terms of timing, persistence, and long-term impact on treatment continuation.
Cheers,
Angie X.
This project is open source at github.com/axshoe/SigVigil.