Menstrual Cycle Phase Effects on Cognitive Performance Assessed by the NIH Toolbox

Merlen Pivaral Arriola

Faculty Mentor:

Dr. Mary Hegarty
Department of Psychological and Brain Sciences

Introduction

Cognitive performance is influenced by a wide range of biological factors, including fluctuations in ovarian hormones across the menstrual cycle (Hampson, 1990; Sawicka et al., 2025). Across the typical menstrual cycle, levels of estradiol and progesterone vary systematically, with estradiol rising during the follicular phase and peaking around ovulation, followed by increases in progesterone during the luteal phase. These hormonal changes have been linked to subtle variations in cognitive processes such as working memory, attention, and executive functioning (Hampson, 1990; Hussain et al., 2014).

Empirical findings on menstrual cycle effects on cognition have been mixed. Some studies report modest enhancements in processing speed, selective attention, or working memory during the ovulatory phase, when estradiol levels are highest (Jang et al., 2025; Sawicka et al., 2025). Other research suggests small or inconsistent effects, particularly in applied or real-world contexts. For example, studies of academic performance across the menstrual cycle indicate that highly motivated college women often show little to no decline during menstruation, suggesting that hormonal fluctuations do not necessarily translate into meaningful cognitive impairments (Bernstein, 1977). 

Taken together, existing research suggests that although ovarian hormones may affect cognition, these influences are frequently small and may be overshadowed by individual differences, task characteristics, and contextual factors (Mumenthaler et al., 2001). This pattern raises the question of whether menstrual cycle-related cognitive differences are detectable with standardized cognitive batteries designed to assess everyday functioning, which may be less sensitive to subtle hormonal effects than laboratory tasks (Hampson, 1990; Sawicka et al., 2025).

Understanding whether menstrual cycle-related variability meaningfully impacts cognitive performance is increasingly relevant for both cognitive neuroscience and women’s health research. Attention, memory, and executive functioning play central roles in daily functioning (Bernstein, 1977), yet many commonly used cognitive paradigms have rarely been examined through the lens of within-person hormonal variation. Importantly, much of the existing literature relies on highly controlled reaction-time or laboratory-based tasks, which may be more sensitive to subtle hormonal effects but less representative of everyday cognitive functioning (Hampson, 1990; Sawicka et al., 2025; Xu et al., 2022).

In contrast, the NIH Toolbox Cognitive Battery uses standardized, validated tasks that are still experimentally rigorous but are designed to capture cognitive functions relevant to daily life, such as attention, memory, and executive function (Zelazo et al., 2014). Despite its widespread use in research and clinical settings (Fox et al., 2022), the NIH Toolbox has received little attention in studies examining menstrual cycle phase effects on cognition. Examining menstrual cycle-related variability using this battery therefore addresses an important gap in the literature, helping to bridge findings from basic cognitive neuroscience with applied cognitive assessment.

The current preliminary analysis focuses on NIH Toolbox measures of attention, memory, executive function, and processing speed. Using a within-subjects design, cognitive performance was assessed during the early follicular phase, when both estradiol and progesterone are relatively low, and during the ovulatory phase, when estradiol peaks. The primary aim of this study was to examine whether standardized cognitive performance differs between these two phases of the menstrual cycle. Given the exploratory nature of this study and the small sample size, any observed effects are expected to be small and should be interpreted as preliminary.

Hypothesis

Performance on NIH Toolbox measures of processing speed, attention, and working memory will be modestly higher during the ovulatory phase than during the early follicular phase.

Methods 

Participants 

The final sample consisted of 19 women aged 18 to 40 years. Participants were healthy adults recruited from UCSB and received either monetary compensation or course credit. Eligibility was determined with an online screening questionnaire. Criteria included a regular (approximately monthly) menstrual cycle, no history of reproductive conditions, and no current use of hormonal contraceptives, steroids, or stimulant medications. The questionnaire also collected demographic information (age, ethnicity, wake time) and menstrual cycle details (dates of the last three cycles and typical cycle length).

Each participant completed two laboratory sessions in a within-subjects design: one scheduled during the early follicular phase and one during the ovulatory phase. Session order was counterbalanced across participants. All sessions were conducted in a quiet testing room at the UCSB Center for Virtual Environments and Behavior. A trained graduate student gave standardized instructions and monitored all testing procedures, and trained research assistants helped with data collection and study administration.

Cycle Phase Assessment 

Cycle phase was estimated from self-reported tracking (calendar or app). Because the ovulatory window is harder to identify than the onset of menstruation, the ovulatory phase was confirmed with daily luteinizing hormone (LH) ovulation tests. LH is a hormone that surges shortly before ovulation and signals the coming transition to the luteal phase. Participants were instructed to take an LH test each day after their period ended, preferably in the afternoon. They received LH test strips, testing cups, and detailed written instructions on administering the tests, reading the results, and following best practices. Participants were also encouraged to contact the primary graduate student conducting the study with any questions or concerns.

Saliva samples were collected twice per session (pre- and post-task; four total) to support hormone-level verification, though these samples were not analyzed for the present report.

NIH Toolbox Cognitive Battery

Participants completed the iPad-based NIH Toolbox Cognitive Battery, which provides age-adjusted standard scores (normative mean = 100, SD = 15). These scores allow each participant’s performance to be compared fairly against a representative sample of the same age, regardless of when the tasks were completed. Tasks included the following:

  • Dimensional Change Card Sort Test. This task assesses key components of executive function, such as cognitive flexibility and attention. Participants were asked to sort a series of picture pairs to match a target picture, with cards varying along two dimensions (e.g., color and shape). The sorting rule switched between trials, requiring participants to adjust their responses accordingly.
  • Face-Name Associative Memory Exam. This task tests episodic memory. Participants were asked whether a face-name pair was easy to remember. After a delay (i.e., completing several tasks in between), they were asked to recall which faces and names they had seen earlier.
  • Flanker Inhibitory Control and Attention Test. This task assesses inhibitory control and attention. Participants focused on a central stimulus while inhibiting attention to flanking stimuli. Rows of fish were shown, with trials that were either congruent (all fish facing the same direction) or incongruent (the central fish facing the opposite direction).
  • List Sorting Working Memory Test. This task measures working memory. Participants recalled and sequenced items presented visually and auditorily, ordering them from smallest to largest across single-category and dual-category trials. Each trial was scored as correct or incorrect as a whole, with no partial credit for individual items that were correctly recalled.
  • Oral Symbol Digit Test. This task measures processing speed. Participants orally identified the number corresponding to each symbol in a grid, using a key provided at the top of a printed reference page.
  • Picture Sequence Memory Test. This task assesses episodic memory. Participants viewed sequences of illustrated scenes (e.g., camping or fair events) and were then asked to reproduce the sequences in the order in which they were presented.
  • Rey Auditory Verbal Learning Test. This task evaluates immediate memory and verbal learning. Participants listened to an audio recording, played on the iPad, of a list of 15 unrelated words and completed three trials in which they recalled as many words as possible. After these trials, participants were asked to freely recall the words once more.

Procedure

Each session lasted approximately 1.5 hours, for a total participation time of approximately 3 hours across both sessions. In each session, saliva samples were collected twice: once immediately before and once after the navigation tasks. Participants were given collection tubes and standardized instructions for self-collection; each collection took no more than a few minutes. Saliva collection, storage, and analysis were conducted anonymously and were not linked to participant identities.

Following the first saliva collection, participants completed an immersive navigation task (data not analyzed in the present report), followed by the NIH Toolbox Cognitive Battery. Participants also completed self-report surveys assessing spatial anxiety, navigation experience, and demographic variables, including date of birth and highest level of education.

Data Analysis

NIH Toolbox age-adjusted standard scores were first organized in a spreadsheet with columns for participant ID, menstrual cycle phase (follicular, ovulatory), session, DSP type, and each of the seven NIH Toolbox cognitive tasks. Descriptive statistics (means and standard deviations) were calculated to summarize overall performance and to examine variability across menstrual cycle phases prior to inferential testing (see Table 1).
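As an illustration of this data-organization step, the following minimal Python sketch pairs each participant's follicular and ovulatory scores for one task and computes descriptive statistics by phase. It is not the analysis script used in this study (which was run in R); the participant IDs, the scores, and the function names `paired_scores` and `describe` are hypothetical, and the rule of keeping only participants with a score in both phases is an assumption made for the example.

```python
from statistics import mean, stdev

# Hypothetical long-format rows for ONE task: (participant_id, phase, score).
# These values are invented for illustration and are not study data.
rows = [
    ("P01", "follicular", 102.0), ("P01", "ovulatory", 105.0),
    ("P02", "follicular", 96.0), ("P02", "ovulatory", 94.0),
    ("P03", "follicular", 110.0), ("P03", "ovulatory", 113.0),
    ("P04", "follicular", 99.0),  # no ovulatory score recorded
]


def paired_scores(rows):
    """Return {participant: (follicular, ovulatory)} for complete pairs only."""
    by_participant = {}
    for pid, phase, score in rows:
        by_participant.setdefault(pid, {})[phase] = score
    return {
        pid: (scores["follicular"], scores["ovulatory"])
        for pid, scores in by_participant.items()
        if "follicular" in scores and "ovulatory" in scores
    }


def describe(values):
    """Return (mean, sample standard deviation) using the n - 1 denominator."""
    return mean(values), stdev(values)


pairs = paired_scores(rows)
follicular = [f for f, _ in pairs.values()]
ovulatory = [o for _, o in pairs.values()]
summary = {
    "follicular": describe(follicular),
    "ovulatory": describe(ovulatory),
}

for phase, (m, sd) in summary.items():
    print(f"{phase}: n = {len(pairs)}, M = {m:.2f}, SD = {sd:.2f}")
```

Dropping incomplete pairs before computing the descriptives keeps the summary based on the same participants who would enter a paired comparison.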

Because each participant was tested in both phases, paired-sample t-tests were conducted for each cognitive outcome to compare performance between the follicular and ovulatory phases, which differ in estradiol levels. Cohen’s d was calculated to estimate the magnitude of any observed phase-related differences. All cognitive outcomes were analyzed as age-adjusted standard scores to control for age-related variability. All statistical analyses were conducted in R.
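The arithmetic behind a paired comparison can be sketched as follows. This Python example is an illustration only, not the R code used in this study. It assumes the effect size is the paired-samples form (often written d_z, the mean difference divided by the standard deviation of the differences); the report does not state which variant of Cohen’s d was used, and the direction of subtraction (ovulatory minus follicular) is likewise an assumption. The scores and the function name `paired_t_and_dz` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_and_dz(first, second):
    """Return (t statistic, degrees of freedom, d_z) for second minus first.

    d_z is the mean of the paired differences divided by their sample SD;
    the paired t statistic equals d_z multiplied by sqrt(n).
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd_diff = stdev(diffs)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    d_z = mean(diffs) / sd_diff
    return d_z * sqrt(n), n - 1, d_z


# Hypothetical standard scores for four participants (not study data).
follicular = [100.0, 95.0, 110.0, 105.0]
ovulatory = [102.0, 96.0, 109.0, 109.0]

t_stat, df, d_z = paired_t_and_dz(follicular, ovulatory)
print(f"t({df}) = {t_stat:.3f}, d_z = {d_z:.3f}")
```

The sketch stops at the t statistic because the p-value requires the t distribution, which the Python standard library does not provide. In R, `t.test(x, y, paired = TRUE)` reports it directly.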

Results

Paired-sample t-tests revealed no statistically significant differences in cognitive performance between the follicular and ovulatory phases for any NIH Toolbox task (all p > .05; see Table 1). Effect sizes were uniformly small (Cohen’s d range = −0.264 to 0.177), indicating minimal practical differences between phases. Although scores on the Dimensional Change Card Sort and List Sorting Working Memory tasks were numerically higher during the ovulatory phase, these differences did not approach significance.

Overall, means and standard deviations were comparable across phases, suggesting that cognitive performance was relatively stable within this sample. No consistent pattern favored one phase over the other. 

These results should be interpreted cautiously given the small sample size (n = 16–18 across tasks). Participants missing a score for a given task were excluded from that task’s analysis, which further limited the power to detect subtle cycle-related effects. Although previous research has suggested potential cognitive fluctuations across the menstrual cycle, the present study did not observe systematic differences.

Discussion

The present preliminary analysis found no statistically significant differences in standardized cognitive performance between the follicular and ovulatory phases across NIH Toolbox measures. Although several tasks, such as the List Sorting Working Memory Test and the Dimensional Change Card Sort, showed numerically higher scores during the ovulatory phase, these differences were small in magnitude and did not reach statistical significance. Effect sizes across all measures were consistently small, suggesting that even if subtle hormonal influences exist, their impact on standardized cognitive performance in this sample is limited.

These findings align with prior research indicating that menstrual cycle effects on cognition tend to be modest, task-specific, and often overshadowed by individual variability (Hampson, 1990; Sawicka et al., 2025). Because the NIH Toolbox emphasizes standardized, ecologically relevant cognitive assessments rather than narrowly defined reaction-time paradigms, the absence of significant menstrual cycle effects in the present study suggests that any hormone-related cognitive fluctuations may have limited impact on everyday cognitive functioning.

Cycle-related effects may be more detectable in low-level reaction-time paradigms than in complex standardized assessments like the NIH Toolbox, whose age-adjusted scoring may attenuate sensitivity to small within-person fluctuations. The absence of significant differences is also consistent with evidence that most everyday cognitive functioning remains stable across the menstrual cycle (Bernstein, 1977).

Several limitations should be considered when interpreting these preliminary results. First, the sample size was small, limiting statistical power to detect subtle effects. 

Second, although hormone samples were collected, they were not analyzed for the present report; without direct hormone quantification, some variability in hormonal status may have gone undetected despite LH confirmation. 

Third, participants were drawn from a Western, educated, industrialized, rich, and democratic (WEIRD) university population, which limits the generalizability of the findings. Prior research indicates that cultural, educational, and socioeconomic factors significantly influence cognitive test performance; therefore, effects observed (or not observed) in this sample may not extend to more diverse populations (Casaletto et al., 2015).
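The first limitation, low statistical power, can be made concrete with a rough calculation. The Python sketch below uses a normal approximation to the power of a two-sided paired t-test, which slightly overstates the exact t-based power at small n. It takes the largest absolute effect size reported in the Results (0.264) and the largest per-task sample size (n = 18), and it assumes that this effect size can be treated as a paired-samples d_z, which the report does not specify. The function name `approx_paired_power` and the comparison sample size of 120 are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_paired_power(d_z, n, alpha=0.05):
    """Approximate power of a two-sided paired t-test via the normal distribution.

    Treats the test statistic as normal with mean |d_z| * sqrt(n) and unit
    variance, then sums the probability of exceeding either critical value.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(d_z) * sqrt(n)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Largest absolute effect size and sample size reported in this study.
power_observed_n = approx_paired_power(0.264, 18)

# Hypothetical larger sample, chosen only to show how power grows with n.
power_larger_n = approx_paired_power(0.264, 120)

print(f"n = 18:  approximate power = {power_observed_n:.2f}")
print(f"n = 120: approximate power = {power_larger_n:.2f}")
```

Under these assumptions, the approximate power at n = 18 falls well below the conventional 0.80 target, so a nonsignificant result at this sample size cannot distinguish a small true effect from no effect.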

Future research should include larger and more diverse samples, incorporate direct hormone assays, include more sessions, and examine additional menstrual cycle phases (e.g., luteal phase). Factors such as workload and stress, which were not examined here, also influence cognitive performance and may interact with the menstrual cycle (Xu et al., 2022), so future studies should measure them. A large meta-analysis found no robust evidence for systematic changes in women’s cognitive performance across the menstrual cycle, highlighting methodological limitations in prior work and the need for hormone-verified, well-powered within-person designs (Jang et al., 2025).

Overall, although no significant phase-based differences were identified, this preliminary study contributes to the literature by providing a within-subject examination of menstrual cycle effects using standardized cognitive measures. Its results are consistent with the view that core cognitive abilities remain largely stable across the menstrual cycle.

References

Bernstein, B. E. (1977). Effect of menstruation on academic performance among college women. Archives of Sexual Behavior, 6(4), 289–296. https://doi.org/10.1007/BF01541202 

Casaletto, K. B., Umlauf, A., Beaumont, J., Gershon, R., Slotkin, J., Akshoomoff, N., & Heaton, R. K. (2015). Demographically corrected normative standards for the English version of the NIH Toolbox Cognition Battery. Journal of the International Neuropsychological Society, 21(5), 378–391. https://doi.org/10.1017/S1355617715000351

Fox, R. S., Zhang, M., Amagai, S., Bassard, A., Dworak, E. M., Han, Y. C., Kassanits, J., Miller, C. H., Nowinski, C. J., Giella, A. K., Stoeger, J. N., Swantek, K., Hook, J. N., & Gershon, R. C. (2022). Uses of the NIH Toolbox® in clinical samples: A scoping review. Neurology Clinical Practice, 12(4), 307–319. https://doi.org/10.1212/CPJ.0000000000200060 

Hampson, E. (1990). Variations in sex-related cognitive abilities across the menstrual cycle. Brain and Cognition, 14(1), 26–43. https://doi.org/10.1016/0278-2626(90)90058-V

Hussain, D., Shams, W. M., & Brake, W. G. (2014). Estrogen and memory system bias in females across the lifespan. Translational Neuroscience, 5(1), 35–50. https://doi.org/10.2478/s13380-014-0209-7

Jang, D., Zhang, J., & Elfenbein, H. A. (2025). Menstrual cycle effects on cognitive performance: A meta-analysis. PLOS ONE, 20(3), Article e0318576. https://doi.org/10.1371/journal.pone.0318576

Mumenthaler, M. S., O'Hara, R., Taylor, J. L., Friedman, L., & Yesavage, J. A. (2001). Relationship between variations in estradiol and progesterone levels across the menstrual cycle and human performance. Psychopharmacology, 155(2), 198–203. https://doi.org/10.1007/s002130100700 

Sawicka, A. K., Michalak, K. M., Naparło, B., Bermudo-Gallaguet, A., Mataró, M., Winklewski, P. J., & Marcinkowska, A. B. (2025). Menstrual cycle phase influences cognitive performance in women and modulates sex differences: A combined longitudinal and cross-sectional study. Biology, 14(8), 1060. https://doi.org/10.3390/biology14081060

Xu, M., Chen, D., Li, H., Wang, H., & Yang, L. Z. (2022). The cycling brain in the workplace: Does workload modulate the menstrual cycle effect on cognition? Frontiers in Behavioral Neuroscience, 16, Article 856276. https://doi.org/10.3389/fnbeh.2022.856276

Zelazo, P. D., Anderson, J. E., Richler, J., Wallner-Allen, K., Beaumont, J. L., Conway, K. P., Gershon, R., & Weintraub, S. (2014). NIH Toolbox Cognition Battery (CB): Validation of executive function measures in adults. Journal of the International Neuropsychological Society, 20(6), 620–629. https://doi.org/10.1017/S1355617714000472

Table 1

Paired-Sample t-Tests Comparing Cognitive Performance Across Menstrual Cycle Phases

Task    Follicular Mean (SD)    Ovulatory Mean (SD)