Only those efficacy outcomes identified in the review protocol are reported below (section 3.2, Table 2). See Appendix 4 for detailed efficacy data.
4.6.1. Change in Phe Levels
In Study PKU-016, in the population of Phe responders, the mean (SD) Phe blood levels at baseline were 680.2 (435.44) μmol/L in the SAP + diet arm and 789.5 (464.97) μmol/L in the placebo + diet arm (Table 17). At week 13, levels in the SAP + diet arm decreased by approximately 30% from baseline to ▬ μmol/L and remained largely unchanged in the placebo + diet arm (i.e., ▬ μmol/L). After week 13, levels remained relatively stable in the SAP + diet arm, but decreased in the placebo + diet arm as these patients crossed over to receive open-label SAP + diet during the open-label treatment period. At week 26, mean (SD) Phe levels were similar between the two groups: ▬ μmol/L in the SAP + diet arm and ▬ μmol/L in the placebo + diet arm. No statistical comparisons were conducted between the treatment groups.
In contrast to Study PKU-016, patients enrolled in the SPARK study were required to be within the Phe blood level target range of 120 to 360 µmol/L at study entry. The mean (SD) baseline values were ▬ (▬) µmol/L in the SAP + diet arm and ▬ (▬) µmol/L in the diet alone arm (Table 18). Phe blood levels remained relatively constant in both treatment arms throughout the duration of the study. At week 26, the mean (SE) change from baseline was -10.1 (▬) µmol/L in the SAP + diet arm and 23.1 (▬) µmol/L in the diet alone arm. The LS or adjusted mean difference between the treatment arms was not statistically significant (i.e., -33.2 [95% CI, -94.8 to 28.4]; P = 0.290). In the SPARK study, nine patients (33.3%) in the SAP + diet arm maintained Phe blood levels in the target range (120 to 360 µmol/L) throughout the study compared with three patients (10.3%) in the diet alone arm (Table 19).
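For readers who wish to check how the reported SPARK figures relate to one another, the following is a minimal illustrative sketch and not part of the study analysis. It assumes a symmetric 95% CI and a normal approximation (the actual LS mean model is not described here), and all function and variable names are hypothetical.

```python
import math

# Values as reported in the text for SPARK at week 26 (µmol/L).
CHANGE_SAP = -10.1      # mean change from baseline, SAP + diet arm
CHANGE_DIET = 23.1      # mean change from baseline, diet alone arm
REPORTED_DIFF = -33.2   # reported LS (adjusted) mean difference
CI_LOWER, CI_UPPER = -94.8, 28.4
REPORTED_P = 0.290


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate SE implied by a symmetric 95% CI (normal approximation; an assumption)."""
    return (upper - lower) / (2 * z_crit)


def two_sided_p(estimate, se):
    """Two-sided P value under a normal approximation (an assumption, not the study's model)."""
    z = abs(estimate) / se
    return math.erfc(z / math.sqrt(2))


raw_diff = CHANGE_SAP - CHANGE_DIET          # difference of the two arm-level changes
se = se_from_ci(CI_LOWER, CI_UPPER)          # back-calculated, approximate
p_approx = two_sided_p(REPORTED_DIFF, se)    # approximate, for comparison with REPORTED_P
ci_includes_zero = CI_LOWER <= 0 <= CI_UPPER

print(f"difference of mean changes: {raw_diff:.1f}")
print(f"approximate SE from CI: {se:.1f}")
print(f"approximate two-sided P: {p_approx:.3f} (reported: {REPORTED_P:.3f})")
print(f"95% CI includes zero: {ci_includes_zero}")
```

Under these assumptions, the difference of the arm-level changes matches the reported adjusted difference, and a CI that spans zero is consistent with a non-significant P value. An adjusted difference need not, in general, equal the simple difference of arm means.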
4.6.2. Neuropsychiatric and Neurocognitive Effects
The effects of SAP treatment on neuropsychiatric and neurocognitive outcomes were investigated using several instruments in Study PKU-016, whereas in the SPARK study, only the effects of treatment on neuromotor developmental milestones were reported. The primary objective of PKU-016 was to evaluate the effect of SAP treatment on ADHD symptoms in patients who were Phe responders with ADHD symptoms at baseline (i.e., n = 19 patients in each treatment arm). The proportion of patients in the Phe responder population with CGI-I ratings of 1 (very much improved) or 2 (much improved) was the second primary end point in PKU-016. Other instruments included as secondary end points were the CGI-S, HAM-A, HAM-D, and the GEC, MI, and BRI index T scores of the BRIEF rating scale. In the SPARK study, neuromotor status assessment was performed using the standardized Bayley-III Scales of Infant and Toddler Development for patients younger than 3.5 years of age and the Wechsler Preschool and Primary Scale of Intelligence for patients between 3.5 and 4 years of age.
The first primary end point in PKU-016 was the change from baseline to week 13 in the ADHD-RS/ASRS total score; a higher score on either rating scale indicates greater severity of ADHD symptoms (Table 8). In both treatment groups, the mean (SE) ADHD-RS/ASRS total score decreased from baseline to week 13 (i.e., -9.1 [2.2] in the SAP + diet arm and -4.9 [2.0] in the placebo + diet arm), suggesting improvement, although the MCID is unknown. In each arm, the change from baseline to week 13 was statistically significant (▬), although the difference between arms was not (i.e., -4.2 [95% CI, -8.9 to 0.6]; P = 0.085). At week 26, the difference between arms was also not statistically significant (▬).
For the ADHD-RS/ASRS subscale score of Inattention, in both treatment arms, the mean (SE) subscale score decreased from baseline to week 13 (i.e., -5.9 [1.4] in the SAP + diet arm and -2.5 [1.3] in the placebo + diet arm) (Table 9). In each arm, the change from baseline to week 13 was statistically significant (▬) and the difference between arms was also statistically significant (i.e., -3.4 [95% CI, -6.6 to -0.2]; P = 0.036), although the MCID is unknown. At week 26, the difference between arms was no longer statistically significant (▬).
For the ADHD-RS/ASRS subscale score of Hyperactivity-Impulsivity, in both treatment arms, the mean (SE) subscale score decreased from baseline to week 13 (i.e., -3.3 [1.1] in the SAP + diet arm and -2.3 [1.0] in the placebo + diet arm) (Table 10). The change from baseline to week 13, however, was statistically significant in the SAP + diet arm (▬) but not in the placebo + diet arm (▬). The difference between arms was also not statistically significant at either week 13 (i.e., -1.0 [95% CI, -3.4 to 1.4]; P = 0.396) or week 26 (▬).
The second primary end point in PKU-016 was the proportion of patients with a rating of 1 or 2 in the CGI-I at week 13 in the population of Phe responders (Table 11). The proportion of patients with this outcome was 26.3% in the placebo + diet group and 21.7% in the SAP + diet group at week 13 and the difference was not statistically significant (i.e., 0.87 [95% CI, 0.46 to 1.64]; P = 0.670). At week 26, however, the proportion of patients with this outcome in the placebo + diet arm (which included patients who crossed over from placebo to open-label SAP) was ▬% compared to ▬% in the SAP + diet arm. The treatment difference was statistically significant in favour of the placebo + diet arm (▬).
For the secondary outcome of CGI-S (where lower scores indicate improvement), the mean (SE) reduction in scores from baseline to week 13 was statistically significant in both treatment arms (i.e., -0.6 [0.2] in the SAP + diet arm and -0.5 [▬] in the placebo + diet arm; ▬); however, the difference between arms was not statistically significant at week 13 or week 26 (Table 12).
In PKU-016, the mean change (SE) from baseline to week 13 in the HAM-A in Phe responders was -3.2 (▬) in the SAP + diet arm and -3.6 (▬) in the placebo + diet arm, both of which were statistically significant (▬) (Table 13). A decline in the HAM-A or HAM-D score represents an improvement in symptoms. The difference between treatment arms, however, was not statistically significant (i.e., 0.4 [95% CI, -1.5 to 2.3]; P = 0.669). Similarly, the treatment difference at week 26 was also not statistically significant (i.e., -0.5 [95% CI, -2.4 to 1.4]; P = 0.590). A similar pattern was observed for the HAM-D results in Phe responders. The mean change (SE) from baseline to week 13 in the HAM-D was -2.1 (▬) in the SAP + diet arm and -2.5 (▬) in the placebo + diet arm, both of which were statistically significant (▬) (Table 14). The difference between treatment arms, however, was not statistically significant (i.e., 0.4 [95% CI, -1.1 to 1.9]; P = 0.588). Similarly, the treatment difference at week 26 was also not statistically significant (▬).
In Study PKU-016, in Phe responders younger than 18 years of age, the BRIEF-Parent was used, which was completed by parents (Table 15). In those aged 18 years and older, the BRIEF-A was used, which was self-administered (Table 16). For each BRIEF assessment, results are reported separately for the three index scales (GEC, MI, and BRI). For the BRIEF-Parent results (i.e., patients younger than 18 years), the differences between treatments at week 13 were statistically significant for the GEC (i.e., -4.1 [95% CI, -7.9 to -0.3]; P = 0.034) and MI (i.e., -4.4 [95% CI, -8.5 to -0.2]; P = 0.038), but not for the BRI index scale (i.e., -3.4 [95% CI, -6.8 to 0.0]; P = 0.053), although the MCID is unknown. For the BRIEF-A results (i.e., patients ≥ 18 years), there were no statistically significant differences between groups for any of the three index scales (GEC, MI, or BRI).
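As an informal cross-check of the week 13 between-arm differences quoted above for PKU-016, the sketch below tests whether each reported 95% CI excludes zero and whether that agrees with the reported P value at a 0.05 threshold. It is illustrative only: the class and function names are hypothetical, only fully reported (unredacted) comparisons are included, and the 0.05 threshold is an assumption about the significance level used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    label: str
    diff: float    # reported between-arm difference
    lower: float   # lower bound of reported 95% CI
    upper: float   # upper bound of reported 95% CI
    p: float       # reported P value


# Week 13 between-arm differences as quoted in the text (PKU-016, Phe responders).
REPORTED = [
    Comparison("ADHD-RS/ASRS total", -4.2, -8.9, 0.6, 0.085),
    Comparison("ADHD-RS/ASRS Inattention", -3.4, -6.6, -0.2, 0.036),
    Comparison("ADHD-RS/ASRS Hyperactivity-Impulsivity", -1.0, -3.4, 1.4, 0.396),
    Comparison("HAM-A", 0.4, -1.5, 2.3, 0.669),
    Comparison("HAM-D", 0.4, -1.1, 1.9, 0.588),
    Comparison("BRIEF-Parent GEC", -4.1, -7.9, -0.3, 0.034),
    Comparison("BRIEF-Parent MI", -4.4, -8.5, -0.2, 0.038),
    Comparison("BRIEF-Parent BRI", -3.4, -6.8, 0.0, 0.053),
]


def ci_excludes_zero(c: Comparison) -> bool:
    """True when the whole interval lies strictly on one side of zero."""
    return c.lower > 0 or c.upper < 0


def is_consistent(c: Comparison, alpha: float = 0.05) -> bool:
    """A 95% CI excluding zero should coincide with P < alpha (assumed 0.05)."""
    return ci_excludes_zero(c) == (c.p < alpha)


for c in REPORTED:
    verdict = "significant" if ci_excludes_zero(c) else "not significant"
    flag = "consistent" if is_consistent(c) else "CHECK"
    print(f"{c.label}: {c.diff} [{c.lower}, {c.upper}], P = {c.p} -> {verdict} ({flag})")
```

The BRIEF-Parent BRI comparison is the borderline case: its upper bound is reported as 0.0 after rounding, so the interval is treated here as not excluding zero, which agrees with the reported P = 0.053.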
In the SPARK study, the only measure of neurodysfunction that was reported was the proportion of patients who were classified as either normal or abnormal with regard to neuromotor developmental milestones in four areas of assessment: fine motor, gross motor, language, and personal-social (Table 23). In all four areas, the majority of children were classified as normal, and there were no statistically significant differences between the SAP + diet and diet alone arms for any of the areas of assessment.