Public Workshop: Safety Assessment for Investigational New Drug Reporting

– Okay, we’re gonna go ahead and get started. Good morning, everyone. I’m Greg Daniel, Deputy Director in the Duke-Margolis Center for Health Policy, and I’d like to welcome all of you to our workshop, Safety Assessment for Investigational New Drug Safety Reporting, which is being convened under a cooperative agreement between the FDA and the Duke-Margolis Center. We’re pleased to be joined for these discussions by a range of leading experts from across the stakeholder community, including government, academia, industry, and others, for a productive exchange on the issues at hand.

Today’s focus will be the FDA’s draft guidance that was issued in December of 2015, entitled Safety Assessment for IND Safety Reporting. This guidance is the latest in a series of policy developments issued by the FDA to improve the efficiency and quality of IND safety reporting and to help sponsors of INDs identify and evaluate meaningful safety information that must be reported to the FDA. There were a number of recommendations provided in the guidance, which outline a systematic approach for safety assessment and seek to clarify key issues relevant to sponsors’ review of aggregate data for IND safety reporting. Examples of the important issues addressed include how sponsors might evaluate unblinded data from ongoing trials and determine thresholds for IND reporting while maintaining trial integrity. Additionally, the 2015 guidance provided information on developing a safety surveillance plan and defining the role and responsibilities of the safety assessment committee.

Comments received thus far by the FDA on the draft guidance highlighted some potential implementation challenges with these issues, and today’s goal is to hear a range of stakeholder perspectives on, and experience with, implementing the draft guidance, to better understand the approaches being used and the key challenges faced, and to identify areas and potential approaches that could inform the final guidance. We’re looking forward to having a robust discussion on these issues and hope you will contribute to the discussion throughout the day.

Before we get started I’d like to quickly go over the agenda. We’re going to have a series of presentations that will frame and tee up key issues for discussion. First we’ll hear from senior leaders at the FDA: Bob Temple and Peter Stein will provide a background on IND safety reporting, an overview of the recommendations in the guidance, and challenges that they’ve heard from sponsors with implementation. We’ll then turn to Janet Wittes and Robert Baker, who will be presenting key stakeholder views from outside of the agency and raise important considerations for implementing the draft guidance recommendations.

After these presentations we’ll spend the rest of the day taking a deeper dive into several of the major issues covered in the guidance. In session one we’ll focus on challenges and best practices for designating anticipated adverse events and systematic approaches for predicting the rates of these events. Following that session we’ll take our first break at around 10:50 and then reconvene at 11 for session two. In that session we’ll address two separate but closely related issues: unblinding of data for safety assessment analyses, and whether trial integrity can be maintained with periodic unblinding by a group other than the data monitoring committee. The panel will then consider, related to that, the intended functions of the safety assessment committees and whether such functions could be addressed through other bodies such as the DMCs. We’ll then break for lunch at 12:30 and reconvene for session three at 1:30 to discuss key issues pertaining to when events in clinical trials may indicate more frequent occurrence in the treatment group compared to the control, and the challenges related to that. And then we’ll take a quick break at 2:45 before our last session of the day, which will tackle key data and methods issues associated with conducting aggregate analyses of patient-level data pooled across study arms or multiple studies.

Just a few housekeeping notes before we kick off with the opening presentations. As you’ll note in the agenda, each session will begin with a presentation or two, followed by reactor comments and then a moderated panel discussion. We’ll also have time set aside in each session for broader discussion with the audience, and for those in attendance we have roving mics for you to use throughout the day. Given the storms impacting the east coast over the last coupla days, at least one or two speakers will be calling in during the panel discussions. I wanna remind everyone that this is a public meeting. The event is being live webcast, and so everything you say will be part of the record, and for those of you joining us through the webcast we encourage you to participate in the day’s discussion by sending questions that you have to a special email address, and here it is: [email protected]. And it’s not written out there, so: [email protected], and our staff will forward those questions to the moderator. For those of you in the room, feel free to help yourself to coffee and beverages throughout the day, and then, lastly, during the lunch break lunch will be on your own. There are a number of restaurants close by and we’ll be starting promptly at 1:30, so if you’re not able to finish, feel free to bring your food back into the room here.

Finally, just as a reminder, although this meeting is being convened under a cooperative agreement with the FDA, this is not a federal advisory committee. We won’t be trying to reach consensus, we won’t be taking a vote, and this meeting will be successful if there’s an exchange of lots of ideas and open discussion. So let’s go ahead and get started. I’d like to turn this over to our FDA colleagues who are here to help kick us off. First, Bob Temple, Deputy Center Director for Clinical Science, Center for Drug Evaluation and Research at the FDA. Bob will give a brief presentation on the regulatory history of IND safety reporting. After Bob, Peter Stein, Deputy Director of the Office of New Drugs, Center for Drug Evaluation and Research at FDA, will discuss some of the challenges that he and FDA have heard from sponsors faced with implementing the 2015 draft guidance. Thank you, Bob.

– Okay, good morning. Who would’ve guessed this would all be so hard?
Okay. No. (Greg whispering) I push the arrow?

– Yeah.

– Okay, so that’s what this is about. Alright, that’s a little bit of the history. We first put out a proposed rule in 2003. Then we got the final rule in 2010 and guidance in 2012, and that last one, it’s not 2005, it’s not 205, it’s 2015. And it’s worth remembering why we did this. There’s a perfectly obvious requirement to report important adverse events under the IND so you can monitor patient safety, and the way the regulation was written it just said if there’s anything serious associated with the use of the drug you should report it, and that meant a reasonable possibility, but it wasn’t defined any further. So what we were getting is, in the cardiovascular outcome trial, we’d get every serious event that happens, and these people would have heart attacks, strokes, and all those kinds of things, and they’d all be reported. And one, it was a lot of work for us and not very helpful ’cause those weren’t mostly related, but it also sorta kept people from looking properly. You weren’t really diving into the data to see what it meant.

So before the rule in 2010 the events reported had to be associated with the use of the drug, and that meant there was a reasonable possibility, and we tried to eliminate events that were probable manifestations of the underlying disease or that were adverse events common in the study population even if they weren’t exposed to the drug. Things that just happened in people at that age: heart attacks, strokes in older people, and so on. And people were also reporting study endpoints. So we didn’t want any of that. So in 2010 we introduced the term suspected adverse reaction, which made it clear that there had to be a reasonable possibility, which also meant that there was some evidence to suggest a causal relationship. Not just that they were taken together, not that the event happened in someone who took it, but that there was reason to believe it might be related. And then, as usual, you had to report within seven to 15 days any of these things that turned up.

And the rule identified three different kinds of things, two of which are easy and not a problem. The third of which is why we’re here. Uncommon events that are strongly associated with drug exposure: Stevens-Johnson and agranulocytosis, things like that. Those are usually drug related when they happen. It’s pretty easy. Then there’s some odd things that aren’t commonly associated with drug exposure but really don’t happen in the population, and the example we gave was tendon rupture with certain drugs. But the final one, actually, that says three easy and one
hard. That’s wrong. It’s two easy and one hard. The first two I did were the easy ones, and then the final one is doing an aggregate analysis of an unexpected adverse event that occurs more frequently in the drug treatment group than in control. Unexpected, by the way, means it’s not in the investigator’s brochure. That’s all unexpected means. We’ve, from the beginning, tried to get people to understand what kinds of unexpected events could be anticipated in the population. We talk about that a lot. So the two things that need to be looked at with an aggregate analysis are serious events that come more frequently in the drug treatment group, and things that we already suspect ’cause they’re in the investigator’s brochure but they’re occurring more commonly than we thought they were gonna. We haven’t had a lot of discussion of that.

So in the 2012 guidance we tried to clarify what all this meant, and we asked for periodic review of the accumulating safety data and appropriate reporting. Obviously, to protect trial integrity, sponsors need a predefined safety monitoring plan with processes and procedures for review of the safety data, including the frequency of review. And, of course, they also have to define how they’re gonna keep people from knowing results they shouldn’t know about. And the guidance addresses the role of unblinding in determining whether a single occurrence of an event needs to be reported. Obviously, if there’s agranulocytosis, you have to know whether the person was on drug or not. And then, as part of the aggregate analysis, if you just report the events it’s of no use to anybody. You really have to unblind it to make sure anybody knows what’s happened.

We wrote a further guidance in 2015 to try to clarify all the issues that had arisen, and they’re not easy. What’s the reporting threshold? I mean, if you’re looking at some event that’s expected in the population, heart attacks or something like that, how do you know when to unblind and look? And then when should you report it? If it’s, you know, 20 versus 18? How do you decide it? And that’s a very hard question I know we’re gonna talk about. So we had a meeting in 2015 and we discovered, we learned, that most people were not really doing what we had hoped they were doing. They were still reporting all of these events. Couple companies, I don’t want to name them, had been looking at the data properly.
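The "20 versus 18" question raised above can be made concrete with a one-sided Fisher exact test on the event counts in the two arms. What follows is a minimal sketch, not a method taken from the guidance or from the speaker: the arm size of 1,000 subjects per arm, the variable names, and the function names are hypothetical, chosen only for illustration, and any real reporting decision would also weigh clinical judgment and the number of comparisons being made.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k), for 0 <= k <= n."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_one_sided(drug_events: int, drug_n: int,
                     control_events: int, control_n: int) -> float:
    """Probability of seeing at least `drug_events` of the observed events
    in the drug arm if treatment assignment were unrelated to the event
    (upper tail of the hypergeometric distribution)."""
    total_events = drug_events + control_events
    total_n = drug_n + control_n
    log_denominator = log_comb(total_n, total_events)
    # Feasible counts in the drug arm that are at least as extreme as observed.
    lowest = max(drug_events, total_events - control_n)
    highest = min(total_events, drug_n)
    p_value = 0.0
    for x in range(lowest, highest + 1):
        p_value += exp(log_comb(drug_n, x)
                       + log_comb(control_n, total_events - x)
                       - log_denominator)
    return min(p_value, 1.0)


ARM_SIZE = 1000  # hypothetical: subjects per arm, not from the transcript

p_20_vs_18 = fisher_one_sided(20, ARM_SIZE, 18, ARM_SIZE)
p_20_vs_2 = fisher_one_sided(20, ARM_SIZE, 2, ARM_SIZE)
print(f"20 vs 18 events: one-sided p = {p_20_vs_18:.3f}")
print(f"20 vs 2 events:  one-sided p = {p_20_vs_2:.6f}")
```

Under these assumed arm sizes, a 20 versus 18 split is the kind of imbalance chance alone produces routinely, while 20 versus 2 is not; the hard cases are the ones in between, which a single p-value does not settle.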

Lilly had published work on this long ago and had been doing it for a long time, but we found that most people weren’t doing it. So we took a look, actually, at the analysis of serious adverse events submitted on oncology trials, and they just mostly weren’t informative. They were reporting everything that happened, not really distinguishing. So thoughtful discrimination of adverse events that are drug related from those that are not was generally not occurring. Most people were reporting just what they always did. Sorta data dump.

The guidance was intended to enable sponsors to evaluate unblinded data from ongoing trials, when that was reasonable, to determine if the threshold for IND reporting, not well defined, had been met, while maintaining trial integrity. So they needed a safety surveillance plan. They should say what they were gonna do so everybody would know, and we recommended, we’ve taken the official letters away, but a safety assessment committee. A group whose obligation was to look at the unblinded data. A very limited group. We didn’t wanna damage the trial or anything like that. We also suggested that any analysis like this should look at data from all ongoing trials, not just one by one, and factors to consider in the reporting threshold, but it’s hard to write a rule on that. Okay, I’m done. That’s what we were trying to do, and now here comes Peter. (papers rustling) (footsteps tapping)

– Well, thanks, and good morning. I appreciate everyone weathering the storm, particularly those who came from the northeast. What I’m gonna try to do is to sort of follow on Bob’s comments, who I think laid out a lot of the issues, and, let’s see, talk just very briefly about some of the particular aspects of the 2015 guidance. I’m gonna mostly focus on the feedback we’ve received. What have we heard from sponsors and from other individuals about some of the challenges in implementation of the guidance?
And some thoughts and strategies that we’ve been hearing about regarding how these issues in the guidance can be best addressed. So lemme start by just touching on some of the particular aspects of the guidance. I think most folks here are quite familiar with it, and Bob has touched on this. The 2015 guidance expanded on the safety surveillance plan. It provided a greater definition of the roles and responsibilities of the safety assessment committee, the SAC. It talked about membership. It talked about the data that the SAC would review. It provided a clear recommendation for regular, periodic, unblinded analysis of aggregated data pooled across the program that the SAC would review. And this was focused, as Bob mentioned, on those events that really only could be identified as drug related by looking at aggregate analysis, and by looking at unblinded aggregate analysis. The guidance did talk about an alternate approach, which was to look at the incidence of events in the blinded aggregate database and look for whether it surpassed a trigger, a threshold value. It did have, I think, a useful discussion on factors to consider and how one determined whether there was an association with drug or not, although this is, obviously, a complicated issue which requires judgment, and there’s certainly gonna be uncertainty there. And then it did talk about the importance of maintaining the blind, the need for the firewall, and ensuring that was appropriately implemented.

So I think Bob’s already talked about this and we’ll be thinking about this through the day. There’s clearly several different sort of categories, if you will, of the types of IND safety reports. The first two categories: those that are typically drug-associated types of events. Straightforward. That can be, obviously, seen in blinded analysis. If you see a Stevens-Johnson, clearly there’s a substantial concern that it may be related to the drug. The other category are those events that can occur in a background population but are unusual for the population in the study. So, for example, myocardial infarctions occurring in a quite young population would be an event that would be potentially seen in other populations but would raise a concern about drug relationship in a young population. But really what we’re talking about are those events that are part of the background, related to the concomitant medications, perhaps related to the underlying disease, or things that can occur simply related to the background population and its comorbidities. And this will be seen in the background of the trial. What we’re really looking for is when there’s a meaningful imbalance in those events, and that could really only be detected, potentially, by looking at unblinded analysis in aggregate. And the challenge there is, how do we do that? And how do we divide those events for appropriate reporting so we can ensure that patients are informed and appropriate decisions are made about the trial?

So what have we heard? We’ve been getting feedback on the 2015 guidance, and what I’ve tried to do is to categorize the feedback into a couple of different categories of types of points that had been made. So I’ll just give you highlights of the categories, and then I’ll talk a little bit more about each of these categories and what the specific challenges have been. So first, trial integrity. The concern around disclosure of unblinded data related to the unblinding and expanded numbers of individuals who will see the unblinded information. The greater trial complexity. We have a new infrastructure with the SAC, now also supporting the DMC, the SAC, the sponsor’s internal committees, and so the complexity of the trial being increased, resource requirements, and also the overlapping responsibilities there. Separating signal from noise. This is clearly a key issue. How do we determine where there’s a meaningful imbalance when we’re looking at multiple comparisons? We’re looking at multiple analyses. We have ongoing trials. How do we pool, and how do we look at the imbalances in a way to make sure that we’re identifying important imbalances but not reporting false positives? And then there were comments about what is the need for the SAC? Can the DMC role be expanded to provide appropriate reporting where there is an important imbalance?
So let me drill down to these a little bit more in the next few minutes. So with regard to trial integrity, we’ve heard about the concerns regarding repeated unblinding of the interim data. Now for the SAC, and also as previously for the DMC, there are more internal sponsor personnel potentially involved, whether directly participating in the SAC or in supporting the SAC. Obviously, this makes the assurance of a firewall a little bit more challenging. There’s an impact, of course, if there’s more IND safety reporting, and unblinded reports of more common events: could that impact trial integrity because of changes in behavior of investigators and management of patients?

Challenges related to trial complexity, overlapping responsibilities. This is a new committee. It has to be folded in to the ongoing program infrastructure of the DMC. There’s the internal sponsor’s safety committees and the requirements for supporting that, the challenges for setting this up, and making sure it’s appropriately integrating. And we saw a number of comments about the complexity of the relationship. How do we assure appropriate responsibilities allocated to the SAC versus the DMC and the internal safety committee from the sponsor? How do they communicate? How is it clear who’s responsible for what? Is there potential for breakdown because of the greater complexity of the numbers of committees and the numbers of individuals responsible for assuring safety of the patients participating?
And then there were quite a few comments about the issues around separating signal from noise. The challenges of pooling. There were a number of comments that we’re dealing with a program that’s ongoing. There are trials at different stages. They may be for different populations, different indications, different doses, different regimens, designed differently, and randomization ratios may not be the same. How do we pool completed and ongoing trials, different designs, and different populations in a way that is meaningful and accounts for these differences and avoids getting false positive imbalances related simply to these complexities of the program? And how could we determine when an imbalance should be reported? We’re looking at a challenge of multiplicity here. We have multiple different comparisons ’cause there’ll be multiple different adverse events. We have multiple different looks at those. How do we assure that when we see an imbalance it’s really meaningful? Do we report, as Bob said, 20 versus 18? 20 versus two? What is our threshold for determining that there really is an association? Because, in any large trial database, there will be imbalances in one or the other direction that may not be meaningful. How do we decide which of those are meaningful, are important to report, so that we can make patients aware and appropriately implement changes to the trial if needed to assure patient safety?

We did hear comments about the alternate approach. The idea that we would set up a threshold for different adverse events and only report when the aggregate, blinded data exceeded that threshold. Okay, well, how do you decide what that threshold should be? Where do you get the data from that says this incidence is the threshold, and above that we should unblind that particular event and report it? And I think we’ll hear some creative solutions today to that particular challenge. We also heard about the expertise of the SAC.
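The alternate approach described above, comparing the blinded aggregate count of an event against a trigger, could be sketched as follows. This is only an illustration under stated assumptions: the background rate, the subject count, the `alpha` cutoff, the `n_event_types` adjustment, and the function names are all hypothetical, and the binomial model is a simplification chosen here, not something specified in the guidance or by the speaker.

```python
from math import exp, lgamma, log, log1p


def binomial_upper_tail(observed: int, n_subjects: int, rate: float) -> float:
    """P(X >= observed) for X ~ Binomial(n_subjects, rate), summed in log space."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    tail = 0.0
    for k in range(max(observed, 0), n_subjects + 1):
        log_pmf = (lgamma(n_subjects + 1) - lgamma(k + 1)
                   - lgamma(n_subjects - k + 1)
                   + k * log(rate) + (n_subjects - k) * log1p(-rate))
        tail += exp(log_pmf)
    return min(tail, 1.0)


def should_unblind(observed: int, n_subjects: int, background_rate: float,
                   alpha: float = 0.01, n_event_types: int = 1) -> bool:
    """Flag an event type when the blinded pooled count is unlikely under the
    assumed background rate. Dividing alpha by the number of event types
    monitored is a simple (conservative) guard against multiplicity."""
    cutoff = alpha / n_event_types
    return binomial_upper_tail(observed, n_subjects, background_rate) < cutoff


# Hypothetical program: 2,000 blinded subjects, assumed 1% background rate,
# so roughly 20 events are expected from background alone.
print(should_unblind(observed=22, n_subjects=2000, background_rate=0.01))
print(should_unblind(observed=38, n_subjects=2000, background_rate=0.01))
```

The sketch makes the speaker’s point visible: the answer depends entirely on `background_rate`, which is exactly the number that is hard to source, and a trigger tuned on one population may not transfer to a trial with a different mix of patients.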

Do they really understand the drug and know what the likely adverse events might be that might be seen related to the drug or in the background population? How do we get the right experts looking at this data in the most informed way? And finally there were comments about whether the SAC was needed. Could this be managed by the internal safety committee from the company in concert with the DMC? Would that mean a change in the role for the DMC? How would that be done? Could they manage to look at unblinded reports? So that was also something that was raised.

So the goals for the meeting today, as I think you’ve already heard but I’d (chuckling) like to just re-emphasize, is really to understand input from all of you about the challenges you’re seeing. Are these the challenges? Are there other challenges? And, in particular, what are the solutions that you’ve come to? How have you addressed these concerns? We’d really like to hear your innovative, creative ideas about how these challenges can best be met, and, obviously, our intent here is to identify the issues and the approaches that can inform us as we think about moving from the draft guidance into the final guidance. So I’m looking forward to a very interactive discussion today. We certainly appreciate your weathering the storm if you’ve come from the northeast, and appreciate your attention and your input. So thank you very much. We’ll move on.

– Great, thanks Bob and Peter. So next we’re gonna hear from two perspectives outside of the FDA on some of their experiences with these challenges. Robert Baker is Vice President of Global Medical Affairs at Eli Lilly, and he will give a brief presentation based on the work being completed by the TransCelerate biopharmaceutical member organizations regarding their perspectives on the FDA’s IND safety reporting guidance. Following Robert we’ll hear from Janet Wittes, President of Statistics Collaborative, who will discuss key challenges from her perspective with the IND safety assessment and reporting. So I’ll go ahead and turn it over to Robert.

– Hi. Good morning. Thanks to the members for coming. Thank you to Duke-Margolis for inviting us to present on behalf of TransCelerate. The title’s a little bit out of date for me. I formerly was in Med Affairs but now have three groups I’m responsible for at Lilly, and the salient one is our global patient safety group, but I’m speaking today not on behalf of Lilly but TransCelerate, where I’ve been the initiative leader on a work stream that we’re gonna describe for you called interpretation of pharmacovigilance regulations. And I’d be remiss to not thank the agency for their interest in gathering perspectives. You’ll find this perspective mostly confirms what Doctors Temple and Stein have already laid out for us.

First, a little bit on what TransCelerate is, for those who aren’t familiar. TransCelerate’s been in existence since 2012. It has a small central group, but it’s funded, and the work is mostly done, by member companies. In the interest, ultimately, of improving our effectiveness in developing new medicines, but certainly with an emphasis, that’s relevant here as well, on not just effectiveness but productivity per unit of effort that goes into it. Since 2012 it’s actually been a fairly successful experiment. I know that from the clinical development side of my work, where many of the initiatives used by TransCelerate have led to us being more productive together or helping each other together in a pre-competitive sort of way. TransCelerate has grown over time and, on the far right, is now up to 19 member companies, with Novartis joining most recently. And these are mostly small and mid-sized companies, captured (chuckling) in the tiny print up above. So a lot of American companies but also European and Japanese, and together, although it’s a minority, of course, of the companies within the biopharmaceutical industry, it accounts for about 70% of the volume by revenue across the industry. I’d highlight, in 2016, and my numbers are
doing the same as others that we saw today, they’re dropping off the line, but BioCelerate was launched. That’s of interest to this. It’s actually not in the clinical space. It’s in the pre-clinical space, but we’re quite excited within the safety group about the potential for the sharing of toxicology data, hopefully advancing our ability to, if you will, seek safety by design and try to conduct more of our safety work before it’s pharmacovigilance and in the clinic. And, as well, in 2017, although not focused on safety, one of the... I’m trying to find it on here, maybe it dropped off from the last version that I saw, but we’re working on sharing of placebo data across disease states in which many companies are working, in order to help us try to create a more robust understanding of what background is that we might expect, which, of course, could be relevant to today’s conversation.

Turning to the present discussion, I was taken a little bit aback, Greg, when you mentioned that there’s experts speaking today, ’cause I wouldn’t count myself in this, but I led a team of experts, and I’d point out that for our team Sata Kamin from Amgen was the leader of the group working on the IND reporting question, as well as Rachel Murkada, who’s here from PwC, who was our capable project manager through this. In 2017, I should have mentioned on the last slide, TransCelerate decided to build on what we took, as member companies, as success in the clinical space, to look at other areas in which companies were engaged where the methodology of pre-competitive sharing of ideas and solutions would help. And of many that they considered, pharmacovigilance was the one that was chosen. And the work that we did last year is actually, I would think of it as, a demonstration project regarding the question of whether the TransCelerate methodology would be similarly effective in advancing what we do. We found, for example, this project on IND reporting much more work than we expected, but we also decided that what we learned took less work than we would have done, cumulatively, pursuing it individually, and that was a bit of the intent with this. So we talked about ambiguous regulation. We, of course, from around the globe get frequent changes in regulation, and occasionally there’s ambiguity in the intent. More often the ambiguity is in how to translate that intent into actually operationalizing it within our companies. And the intent or the value in this is actually
hopefully one that’s very tightly shared, which is effectively understanding safety, and we thought that the efficiency for us would also be a gain in quality, because not every company has a depth of expertise in every aspect, and so by sharing together we could more adequately obtain that. We found that was the case. I hadn’t thought about it, but we also found that not every company has the same visibility to thinking, or nuance, or clarification that we get individually in conversations with regulators, and that was quite useful in sharing together. And through that we, in terms of the benefit to us, would be less likely to misinterpret and therefore become outliers, putting ourselves at inspection risk. And, finally, as I think Duke-Margolis is helping us with today, that perhaps, together, we could focus better on topics that are best raised or bubbled up for discussion with those who do set our rules.

So this group, I was a bit surprised. I thought that we would want to get a newly breaking bit of regulation to work together on, but as our member companies got together and started working they highlighted the 2015 draft guidance on IND reporting as one where there was a lot of enthusiasm for taking it on, which I think you can take as a marker. As a marker that companies individually were finding it difficult, or were finding a bit of anxiety about whether we’d quite gotten it right as the agency had intended. This rehashes what you’ve heard already, so going on to our method, which was fairly simple but, as I said, took a lot of work. Satica and the team from, I think, about five or six other companies got together and first went line by line through the guidance, and it probably would be fun some day to show you a highlighted version, ’cause it was embarrassing how many things we had highlighted where people weren’t quite sure whether they’d gotten it right or what to do about it. But then, based on that, they compiled a survey, a lengthy survey, that we sent to our experts at each of the individual member companies to try to get feedback on how we had interpreted, how we had responded, how we’re operationalizing against this. They found a number of followup questions, and we’ve completed results, and now we’re at the stage of beginning to share them. And of course we have available much more detail behind what I’ll report now, which actually is just this slide. These are our conclusions. I’m gonna have a couple of slides to go into more detail on the second and third bullet points.

The first was that the companies had existing approaches and have adapted those approaches, that I guess you’d think of as the surveillance plan, surveillance team for reviewing accumulating data, with the focus on subject protection here. The expectation is that for patient protection before approval we’ll have more comprehensive unblinding and pooling than we would at the point of trying to get emerging events for the sake of the subjects who are in our studies. Where we began to find variability was on the point of what were the methods for reviewing the data, unblinding the data, thresholds for reporting, as you heard from Dr. Stein this morning. And the anticipated serious adverse event concept has been implemented not by all, and somewhat variably. And often, as I’ll get to in a second, with an ad hoc aspect at the level of medical review at the final point. We have, however, seen across the groups that participated an average of over 50 percent reduction in the number of reports we were sending. In the case of Lilly our estimate is an 80% reduction; however, we still have been admonished by at least one division that we are sending too many still amongst the residual. And the last part, all striving to meet the spirit. I’ll come back to that at the very end, but I think there was actually, speaking for myself, but I think among most of us, pretty strong sympathy for what’s trying to be achieved through these rules.

To dive into the two that were most challenging. This one is on anticipated serious adverse events and documentation within the protocol. That’s possible. It’s being done by most. It becomes more difficult the more comprehensive one tries to be in doing this, so, for example, in the oncology space we typically would have a fairly robust thinking about hematological impacts for a lot of drugs, but there are many, many other impacts that are in the background within cancer that we may not capture there. In terms of background rates there are a number of approaches for this. I would state that no matter how we get at it, it’s perhaps harder
than it looks on the surface because even if we had a very strong ability to estimate over an overall population clinical trial populations differ in features that are important Geography’s really subject to who comes into this study is always going to create some variability in it Most companies do maintain anticipated serious events in both the clinical database and the safety database And finally this last point that most are managing anticipated events with a medical review as part of causality assessment Let me say more specifically what that means in the case of Lilly because I guess it’s a little vague as written here So we will capture some of this at the front in our plan However we also have medical reviewers who will look at each case that might otherwise meld for the confound related to a disease state background or other concomitant treatment that would make that individual event uninterpretable on its face The event that’s not itself convincing will be one that we then end up screening out So this is perhaps not as intended upfront thinking and documentation a priori as opposed to medical judgment on the back end I think reflecting how very broad are the possible events that could happen in the course of the study And on this one the aggregate analysis of safety data This, of course, will speak both to how this is done and also thresholds and we have found this to be a significant area of continuing questions so thankfully we’ll discuss further today Approaches differ such as comparison of rates across treatment arms, pooling across studies as appropriate, comparison of incidence to baseline frequency I mentioned already some of the challenges with that at Lilly We tend to think of also aggregate being at the level of the product rather than at the level of the trial I think that’ll come up as we talk about DMCs today When running blinded studies most companies perform aggregate analysis however unblinding is done for cause and that would certainly be
the case for Lilly I can represent that as well and a couple of comments that might be of interest to it So our approach is that the blinded team looks at the accruing events with the expectation that they would adjust as though any event is occurring only in the experimental arm and therefore an excess of that That, of course, gets more complicated if there’s more than two arms ’cause it’s not as simple as doubling or dose effects wouldn’t be easy to portray in this

And it’s a judgment regarding (chuckling) whether it’s more than the background I’ve been surprised I think across most of our programs it’s unusual for unblinding Maybe one or two per program We had one approval recently in the oncology space which would have an example where this failed We found significant imbalance that’s now labeled and it was an event that is quite often associated with cancer and yet there was an imbalance that we had not picked up in the course of this approach And so before next steps let me get to a conclusion I mentioned two things I haven’t that are sort of general points that are worth thinking about One is about specificity and one is about harmonization So one of the interesting things about the draft guidance that those working closely on it would understand even better than me is that in some cases they thought it was more helpful than necessary in that there’s a lot of differences that are hard to predict on the front end between a given drug, a different given disease state, or a given company approach where specificity could actually make it harder to implement against this or granularity of the instruction On the other hand I think the whole reason we got together is for the other direction (chuckling) where if don’t have a guidance about how best to do it Regarding harmonization I think you could think of it as different levels So, as most are aware, outside US there’s a movement in a different direction and actually sort of pushing companies for more reporting To some degree that harmonization should be just our problem and it is solvable just with more resources in the way that I’ve mentioned We have set for to meet the US intent a separate process, manualized process to pull out events Where it becomes a more common problem is when we get into one of the issues that we’ve teed up today which is things like trial integrity where the harmonization issue either prevents something that’s mandated by a different regulatory group 
or maybe even more importantly would collide with another tenet that we are trying to achieve together But in terms of our conclusions, they could not be much more aligned with the four topics that you have on today so we look forward to that discussion and we also look forward, hopefully, to even deeper discussion and work with the agency going forward Thank you That was in case I forgot to say thank you (audience chuckling) – Yeah so I’m Janet Wittes I’m very glad to be here Thanks everybody for coming I feel bad for those up north with the snow I’d like to talk Oh what do I do? Ah that’s me I changed the word from key to some because I don’t know what key means in this context but let me tell you a little bit about my role I think maybe why I’m here and then proceed with some of the issues many of which will be just echoes of what you’ve just heard but echoes from a different chamber So a lot of what I do is sit on DMCs or report to DMCs and my experience there after the 2012 guidance draft came out I would routinely say, either when I sat on a committee or when I reported, that we really should look at the guidance Or I would tell the companies who were involved hey you really should come and tell people what the guidance was and they’d roll their eyes and say it’s irrelevant to you So I’ve given up So that’s one piece The other piece is that I do a lot of consulting with what I consider small biotechs You describe small to me they’re Dr.
Baker I don’t see you anywhere Ah there you are What you described as small to me is huge I’m talking about tiny little companies and when I bring up the guidance to them, which I do, they not only roll their eyes they say no, no, no We’re not gonna touch those We’re reporting everything because our heads will be chopped off if we don’t report everything So that’s where I come from You all know what the old problem was Many, many reports were sent to many people, to the IRB, site investigators, to the DMC members, and to the FDA of course Few were read Who could read them? It was this great big pile of stuff and even if you could read them they were uninterpretable ’cause they didn’t tell you who was on what So the goal I look at the goal of this set of guidances

and movement toward what we would hope is harmonization as to separate the wheat from the chaff And that’s a complicated thing How do you send fewer reports but don’t miss signals? Sponsors need to assess risk not versus benefit and I’ll get to why I say that That’s for my DMC hat Know what to report and their ultimate goal is to quantify harms So what are the challenges that this has raised? And we’ve heard them already from the previous three First of all fitting an SAC, a safety assessment committee, into an organization that’s already there You’ve heard that from Dr. Baker I think it’s a bigger problem than people may realize It’s easy if you start something To start a new little company To figure out how to do that But to integrate a new system into something that has been working and is well oiled and now you plop something new That’s a huge issue I find, as a statistician, that the guidance doesn’t really have much of a hint of statistical thinking No way of figuring out how to set these thresholds There are thresholds for reporting but how in the world do you set them? There’s the issue that Dr. Baker raised about the U.S. FDA versus other countries and I think that’s a huge issue because if the purpose of this, I’ll get to it at the end, is sort of a paper reduction act and you have to do one thing for the FDA and something completely different for the EMA and other regulatory agencies that is an increase in burden not a decrease And then, of course, as you’ve heard from all three who should be blind to what? And that is a huge, I think, a really complicated issue Who is blind? At what point does the unblinding become so big that it affects the integrity of the trial? So what are we talking about? And I wanna go through the words because those of you who are here probably know what all these words mean but I venture to say, I will bet money, that most of the investigators don’t know what these words mean and even some of the sponsors don’t So suspected As Dr.
Temple said there has to be some evidence to support a causal relationship Not just my kid took a vaccine and became autistic the next day You gotta have evidence of that Unexpected and we’re gonna get to this word unexpected in a bit Not known to be associated with the drug or more severe occurrence of the event or, as Dr. Temple said, more common than expected Serious distinguish from severe Now again I sit on DMCs I know a lot of investigators They don’t understand the difference between serious and severe One is a regulatory definition and one is oh my God my arm hurts Adverse I think we all know adverse means bad and reaction as distinguished from the more neutral word event A reaction to something caused by that’s related to the suspected of at the beginning And if you look at some of the guidances those words are sometimes used interchangeably and that’s confusing for people who don’t know what the meaning was s’posed to have been Okay there’s this definition Again I suspect all of you know it but you better be telling people that you’re working with so they understand the difference between expected which belongs to the drug and anticipated which is a property of the population And then we’ve heard the three kinds of events The single occurrence known to be strongly associated with drug exposure and they are, I think that’s easy as Dr. Temple said, that’s an easy one If it happens don’t agonize Report it That one’s easy Where’s my B? I lost B Well anyhow B is also easy But it’s the C events It’s the events that are No that’s B Okay, sorry, that’s B A few and what does a few mean? There’s my status Is few one, is it two, is it three? What is few? Not commonly associated with the drug

so not really expected with that drug but otherwise uncommon in the population so anticipate unanticipated So you agonize a little bit but ultimately report and I think the importance here here’s this poor little I don’t know whether it’s a rat or a mouse saying I don’t usually volunteer for experiments but I’m kind of a puzzle freak It’s important to know because of this word anticipated what is your population? And then but the real problem or the problem of the day is the type C events Events that occur more frequently in the treatment group than in the control group I’m sorry more frequently in the treatment group than one would expect but the important thing is no single case is a surprise So they may be anticipated events like strokes and MIs in the elderly I hate to say that now as I’m getting into that word myself but infections in kids Those are anticipated because if they were truly unanticipated we’d dump them into type B Now what kind of trials are we talking about with the type C? I think we’re talking about long, randomized long-term blinded trials Those are the ones where these unanticipated type C events will occur with a frequency large enough to notice that one group is having a higher rate than the other And why do they have to be randomized? Why do they have to be blinded? Because these are the ones that are problematic There are the ones we have to make decisions about How do you know that you have an excess of an event, right? So question of the day When should these be reported? And the answer is well that’s easy when the frequency is higher in the treatment than control That’s when they should be reported So when is 15 days? I mean we know if you have a Stevens-Johnson we know what 15 days is That’s from the time that you have event Well what is 15 days for this? It’s the time at which somebody recognizes that there’s an increase Is that right? 
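One way to make the question above concrete (how to judge that an event is occurring more often in the treatment group than in control once the arms are known) is a one-sided Fisher exact test on the unblinded counts. This is only a minimal sketch: the function name and example counts are hypothetical, and neither the speaker nor the draft guidance prescribes this test or any particular cutoff.

```python
import math


def fisher_one_sided(events_trt: int, n_trt: int, events_ctl: int, n_ctl: int) -> float:
    """One-sided Fisher exact p-value: the probability of seeing at least
    `events_trt` events in the treatment arm if the event were unrelated
    to treatment assignment (hypergeometric upper tail)."""
    total_n = n_trt + n_ctl
    total_events = events_trt + events_ctl
    denom = math.comb(total_n, n_trt)
    p = 0.0
    # Sum over every table at least as extreme as the one observed.
    for k in range(events_trt, min(total_events, n_trt) + 1):
        p += math.comb(total_events, k) * math.comb(total_n - total_events, n_trt - k) / denom
    return p


# Hypothetical counts: 12 of 400 treated subjects vs 4 of 400 controls.
p_value = fisher_one_sided(12, 400, 4, 400)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value would only flag the imbalance for medical review; whether it amounts to a reportable excess remains the judgment call the speaker describes.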
I don’t know To me that’s one of the ambiguities I’ll say And we have to unblind to know the answer so there’s the rub Part of the evidence requires knowing if the person took the drug because if they didn’t take the drug then the drug couldn’t have caused the event So somehow all this that seems so simple on the surface is actually not so simple When do we? How do we know? How do we unblind? When do we unblind? And when do we say aha there’s an excess we have to report? So who looks at blinded data in an ongoing clinical trial? Now this may be very company specific what I’m saying but, in general, the participants They have to know whether they report something and I think one of the things that really is important to me or it’s important is that we kind of forget that the reporting is coming from them And they only know about themselves, right? They’re blind they don’t know what they are on They know about themselves and what the social media is telling them I have a very interesting case, that’s why I went like this, I had Shingrix last week and I had a huge reaction And I reported to VAERS because I thought okay I’m supposed to report things to VAERS and one of the things it says, VAERS is the vaccine adverse event reporting group, it’s called severe if you change your daily behavior So I look at this and I say well yeah I changed I didn’t go to the gym because it hurt so much but not many people my age go to the gym so I’m gonna say no and I realized as I was doing that that the collection of data from the very beginning is something we have to worry about The investigators they only know about their participants and what they hear from other people and the sponsors know the entire development program so they know a whole lot more, some blinded and some unblinded, but they don’t have access to data from other drugs in the same class Who looks at the unblinded data? Mostly the DMCs and the SACs now

but the DMCs don’t think of risk They think of risk versus benefit The questions that they ask is are the participants safe? And that means is the likely benefit outweighing their likely risk? And therefore they’ll be pretty tolerant of adverse events early on when you don’t see benefit and they’re asking also is the trial still asking an important question? They’re very reluctant to report out and they have a structural limit because they only know about their trial not about everything about the drug, and they certainly don’t know about drugs in other classes What do you guys bring to the table? You bring both blinded and unblinded data You bring other drugs in the class and so when a company gets a question about would you look at neuropathy? And they scratch their head We haven’t seen neuropathy aha Somebody else, some competitor must be having that problem but not much data from large, ongoing trials Okay question How long should participants be followed after they stop study drug? This pertains to how safe things are The rule of thumb is, well, five half-lives the risk is gone but that occurs, that ignores a cascade of events that might occur after people stop taking the study drug which could underestimate harms On the other hand if people drop off of this study because they’re feeling sick and they drop off more in the control group you can overestimate harms So failure to follow a long time can under or overestimate harms So who hears the summary? Who sees what? The sponsor is blind to the ongoing trial Full access to other data from trials of the same drug that haven’t finished and no access to drugs information, ongoing information from other drugs in the same class The DSMB is unblind to its own The study it’s looking at but blind to everything else and the FDA has a different kind of data The question that I think we have to face we’re going to be facing today is how do we marry these three? 
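The five half-lives rule of thumb mentioned above comes from simple exponential decay: after n half-lives, a fraction 0.5^n of the drug remains. The sketch below works through that arithmetic, assuming first-order elimination; the function names and the 24-hour half-life are hypothetical. It says nothing about the downstream cascade of events the speaker warns can follow after the drug itself has cleared.

```python
def fraction_remaining(half_lives: float) -> float:
    """Fraction of drug left after a number of half-lives, assuming first-order elimination."""
    return 0.5 ** half_lives


def washout_days(half_life_hours: float, n_half_lives: float = 5.0) -> float:
    """Days needed to pass n half-lives for a drug with the given half-life."""
    return half_life_hours * n_half_lives / 24.0


# After five half-lives about 3% of the drug remains.
print(f"{fraction_remaining(5):.5f}")
# Hypothetical drug with a 24-hour half-life: five half-lives take 5 days.
print(washout_days(24.0))
```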
Are there cultural, operational, and legal, and proprietary roadblocks to marrying this? Even if we succeed will we identify harms more quickly, reliably which is that the ultimate goal? Or is this just an elaborate paper reduction act? Thank you – Okay, great presentations from Robert and Janet and we should have all of you more often because you guys are exactly on time This was wonderful (audience chuckling) So now we’re gonna go ahead and kick off our first actual session on the agenda and I’ll ask the panelists to come up to the stage and then I’ll turn things over to Jacqueline Corrigan-Curay the Director of the Office of Medical Policy Center for Drug Evaluation and Research at FDA who’ll be moderating this session which is about identifying serious anticipated events So Jacqueline thank you – Okay thank you I’d just like to call the panelists to the stage and then I’ll do some introductions and we’ll have our presentation and then our discussion (Greg murmurs) And we’ve heard a little bit about the challenges with identifying the anticipated events which is really at the heart of when we’re going to decide which events are not gonna be reported as single events and so that’s what we will focus on We’ll try and stay in that swim lane I know it’s hard sometimes not to get into other issues about thresholds and things like that So first I’d like to invite you to come up and then I know they’re waiting for Mike I’d like to introduce all our panelists and then we’ll get started So first I’d like to introduce Barbara Hendrickson she’s in the Immunology Therapeutic Area Head

in Pharmacovigilance and Patient Safety at AbbVie I’d like to introduce Chris Granger, I’m sorry Chris, Professor of Medicine and Director of the Cardiac Intensive Care Unit at Duke University Medical Center Adrian Dana who’s the Vice President of Global Patient Safety and Risk Management at Alnylam and Mary Ross Southworth who’s our Deputy Director for Safety Division of Cardiovascular and Renal Products Office of New Drugs at FDA And so as soon as we get a mic, oh we’re getting our mics now, Barbara’s gonna lead off this session with a presentation (Jacqueline murmurs) (Greg murmurs) – Okay maybe I’ll get started The reason I’m presenting today on behalf of AbbVie is because I’m one of the co-leads of our Safety Assessment Committee initiative that’s been going on for the last two years and I just was asked to speak about identifying expected adverse reactions and anticipated events and, of note, the presentation contains opinions and positions that are my own and not necessarily of my company AbbVie As we’ve already heard today from previous speakers anticipated serious adverse events are those that the sponsor can foresee as occurring with some frequency in the study population independent of investigational drug And those are not interpretable on the basis of just one single event and so we can’t conclude with a reasonable possibility that the investigational drug caused that event And these serious adverse events that are anticipated may fall into several different categories including known consequences of the underlying disease Events that are common in the study population independent of the investigational drug administration so events that occur in the general population that may be included in the clinical trials Events that are known to occur with the drugs that are being administered as part of the background regimen They might be required for participation in the clinical trial and then serious adverse events that may be anticipated for a subset of
the study population We’ve already heard about the concept of a safety surveillance plan which is described in the FDA guidance which lists these anticipated events that won’t be reported as individual cases to the IND and also may include what we heard already about expected serious adverse reactions that are listed in the investigator brochure So adverse reactions are those adverse events that have some basis to conclude that there’s a causal relationship with the occurrence of that adverse event and administration of the product If those adverse reactions are reported as serious events in clinical trials they may be listed in the investigator brochure and considered expected for the purposes of expedited reporting to global health authorities In order to meet the guidances globally for the SARs that are reported in the IB at this point I think those lists for products in development are limited But for those SARs that are listed in the investigator brochure those should be also noted in the safety surveillance plan and be monitored for cases that are different in nature than what’s listed currently in the IB or are occurring at a different rate than is listed in the IB So with our AbbVie projects we’ve searched for the types and frequencies of SAEs in similar patient populations that are being studied in our clinical trials mainly in comparable clinical trials that have a patient population with a background medication regimen that’s as similar as possible to what’s being administered in our trials Also registry studies and, to some extent, claims databases

So potential sources for the clinical trial data for us have been advisory committee briefing books, health authority product approval summaries, also our own internal sponsor data sources, and publications including registry reports But other potential sources include we heard mentioned the TransCelerate Placebo Standard of Care database, potentially electronic health care records, and claims databases which I mentioned Also you may consider using regulatory reporting systems to identify the most frequent events being reported in particular patient populations And then, if relevant, the prescribing information of background treatment medications being administered in the clinical trial So the participating product safety teams at AbbVie developed a safety surveillance plan for their products which included the expected serious adverse reactions that were listed in the investigator brochure as well as any disease related events that were closely associated with study endpoints that were in the protocol And then these two categories of events would not be reported as single case reports globally but the last, the other anticipated SAEs, would be specifically addressed in the FDA guidance So we searched for public information, as I mentioned, and also our relevant internal databases to identify the most frequent serious adverse events in similar patient populations as would be in the clinical trials that we would be conducting and we took all this data and merged it and one of the approaches we used was to develop a heat map where we looked at the detection of the most frequent SAEs that have been seen across multiple data sources for those patient populations So you can see in this graphic that those events that appear in red or orange would be the ones that were most frequently reported across different data sources So for the AbbVie pilot projects we have tried to do this blinded review using a couple of different statistical methodologies to look at how the
rates in the ongoing clinical trials compare to historical reference rates and when they meet a threshold based both on statistical judgment and medical judgment that we feel the events merit being referred to a safety assessment committee that those events then are referred over to an internal safety assessment committee which operates under a charter and that charter has been developed to mimic, in many respects, a data monitoring committee charter And that internal group then may review that data in an unblinded fashion and make a decision about whether a threshold has been met for IND safety reporting If an IND safety report is recommended then the data monitoring committee would be notified and the product safety team So some of the challenges we’ve had is first of all, I think, identifying the search strategy for the events for the safety surveillance plan Whether to use a preferred term search strategy or grouping of preferred terms or high level terms getting at more of a medical concept level or possibly a MedDRA narrow standardized MedDRA query For example our different approaches we’ve considered Then, based on those search criteria, we try to come up with an estimated background reference rate looking at multiple data sources and looking to see how much variation there is across the different data sources that we’ve looked at Obviously if it’s relatively narrow and the rates are fairly consistent that is very helpful but then we select a reference rate based on our examination of the different sources And then we assess which serious adverse events are likely to be reported in greater than three subjects in the clinical trials depending on the patient numbers and exposures that are planned because, I think, as was mentioned in the previous talk I think small trials where you have few SAEs probably don’t meet this criteria often So we’re looking generally at larger trials that have more serious adverse events that might meet this criteria So then we 
determine among those anticipated SAEs that would be occurring at higher rates which should be in the safety surveillance plan and some of the questions that we ask during the review of the data are are the events only anticipated for a subset of subjects? For known consequences of the disease

how specific is this event’s association with the disease? Is there an overlap of the anticipated event with a potential risk of the study product? And what is the capability to assess if the event is occurring at a higher rate than anticipated? So some of the additional challenges we’ve had are complexities when a product is being developed for multiple indications So, in this particular case, you need to have multiple lists of anticipated events because sometimes the background reference rate can be quite different for different study populations and this does make the execution of the safety surveillance plan activities more complicated Also we’ve had questions about the granularity by which events should be reported as IND safety reports Typically, traditionally, individual SAEs have been reported at the level of a MedDRA preferred term but what we’ve been doing in our safety surveillance plan is really reviewing events at more of a medical concept level which means that they’re being identified by preferred term groupings, or high level term MedDRA groupings, or possibly narrow standardized MedDRA queries But this does lead to issues with trying to develop the reference rate because not always is the full list of SAEs available in public documents for different sources and so we can’t always be sure that we’re comparing apples to apples as it were And also even if the rate is reported as a grouped term in public documents all of the preferred terms that make up that preferred term grouping are not necessarily provided And as was also mentioned in a prior talk the rates of anticipated SAEs could be influenced by factors that differ among programs like exclusion inclusion criteria and investigative site countries which, again, may not always be evident from public documents So another approach is to use electronic health care records or claims databases to estimate reference rates There are challenges with this approach as well because these databases use ICD
coding as opposed to MedDRA coding that is used in clinical trials There’s also questions about the matching of the patient population with those in these databases For example for oncology trials there’s difficulty identifying similar patient populations in regards to tumor histology or greater molecular profile especially in claims databases Also there may be differences in the approach to recording diagnoses leading to hospitalization in these particular data sources compared to clinical trials and events that meet serious criteria in clinical trials generally you’re looking at events leading to hospitalization in these other databases and that may not necessarily match up with events that are being reported in clinical trials as serious So all this can be pretty labor intensive But there are, I think, some opportunities for the future In terms of identifying anticipated events I think we could consider aligning relevant stakeholders and elements applicable across multiple product development programs, in particular, known consequences of the underlying disease, for example, as well as events that are common in the study population Also events that are known to occur with drugs that are included in some of the standard backbone treatment regimens for a lot of clinical trials So these event databases could be updated annually with relevant information with alignment across multiple stakeholders Also I think there’s opportunities for better estimates for expected event rates with collaboration to identify potential data sources early in clinical product development Some of the ideas are advancing the use of electronic health care records which provide more granular information than claims database sources Also potentially developing standard of care synthetic control arms which use similar inclusion exclusion criteria to those being employed in the clinical trials And then consideration for linking via third party clinical trial patient data and mainly
electronic health care records to derive comparative groups So I just wanna thank everybody at AbbVie who participated in these pilot projects and all their work on this and I would turn this over back to the moderator That’s the conclusion of my remarks – Thank you so much – Yeah
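The screening steps described in the presentation (estimate a background reference rate, check whether an event is likely to occur in more than three subjects given planned exposure, then compare blinded observed counts with the reference) can be sketched roughly as below. This is an illustration under stated assumptions, not AbbVie's method: the Poisson model, the function names, the `alpha` cutoff, and the example numbers are all hypothetical, and the presentation stresses that any statistical flag is combined with medical judgment before referral to a safety assessment committee.

```python
import math


def expected_events(reference_rate_per_py: float, patient_years: float) -> float:
    """Expected event count if the trial population behaves like the reference source."""
    return reference_rate_per_py * patient_years


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return max(0.0, 1.0 - below)


def screen_event(reference_rate_per_py: float, patient_years: float, observed: int,
                 min_expected: float = 3.0, alpha: float = 0.05) -> dict:
    """Screen one anticipated event against its background reference rate.

    min_expected stands in for the 'greater than three subjects' criterion;
    alpha is an arbitrary illustrative threshold, not one from the guidance.
    """
    expected = expected_events(reference_rate_per_py, patient_years)
    tail = poisson_upper_tail(observed, expected)
    return {
        "expected": expected,
        "list_in_plan": expected > min_expected,
        "tail_probability": tail,
        "refer_for_review": tail < alpha,
    }


# Hypothetical event: 0.02 events per patient-year, 500 patient-years planned,
# 20 blinded events observed so far.
print(screen_event(0.02, 500.0, 20))
```

Because the blinded count pools all arms, a flag here would at most prompt unblinded review by the internal committee; it does not by itself establish an imbalance between arms.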

– Please join us So that was a great presentation sort of highlighting a lot of the challenges The sources, the issues of coding, the seriousness of the events, all of which are ones that we would like to discuss So I think I’m gonna start We have a couple of questions and I’d like to start with Adrian and then I’m gonna go a little out of order to maybe talk about your experience in this We know there are many anticipated events in a population particularly in elderly populations and what has your experience been in terms of data sources and the ability to really identify a sufficient number of events with sufficient specificity to undertake the way we would anticipate reporting these? – Well I think for us in particular at Alnylam we’re dealing with many of our products with rare diseases and I think the rare disease population really presents unique, unique issues to figure out what are the anticipated events Those are often diseases that are not well studied until there’s a chance to study a treatment and so one of the things that was brought up in the previous talk is the idea that the comparator groups or what has been done in previous trials to sorta sort out the background rates is very helpful but often not fully disclosed so that we can find those backgrounds because if the issue is felt to be in the background it’s sort of not part of the drug adverse event profile and therefore not disclosed So I think that’s a rich source of information that we perhaps have not mined quite as well And then, of course, another You mentioned elderly but I do think that, as a pediatrician, (chuckling) I have to mention that we also look at diseases across an age spectrum and we have to be very cognizant of where to find that kind of background information and what we have these sort of artificial age breakdowns but that may not really be appropriate to what we’re looking at and it may not be accurately reflecting what those background rates are So I think there are really 
unique challenges We’ve also attempted to use electronic medical records and claims databases and we’ve found that, in the rare disease setting, that often the diagnosis or the coding is very inaccurate So another sort of watch out when you use these databases that you have to evaluate the source of the information before you can actually use it So one of the things we’ve done is try to anticipate this and to set up things in advance For some of the rare diseases we’ve done some natural history studies before we’ve sort of gotten into our actual clinical trials in order to try to see what happens to these patients What is their natural history? What are we planning to affect with our drugs? And so setting up disease registries or a comparison population in advance, if you can do it, can be extraordinarily helpful – Thank you Chris you do a lot of, I’m gonna turn to you, you do a lot of cardiovascular trials so some of your events are actually your endpoints But can you speak of the challenges in some of your populations of these, you know, we often speak of the MI as the stroke as the obvious ones but in determining those other anticipated events in those elderly populations beyond the cardiovascular – Yeah maybe let me start by also congratulating Barbara on a really nice presentation I think that highlights the complexities and the need for a plan that’s specific to many of these variables which I think are so important and we can’t really, in my senses, we can’t really have a highly kind of structured requirements because of the complexity of these things One comment that I might make though is I think we’re really getting into trouble when we use either things like claims data or even electronic health data as a reference for safety for what we’re expecting in clinical trials especially when we’re looking for anticipated effects where there may be only a modest difference in those events according to treatment groups and I think we’re fooling ourselves
if we think anything other than a randomized comparison group will really provide anything reliable to understand those safety events With respect to the outcomes another context comment that I’d like to make is I think the most important thing we can do to understand what effect drugs have on kind of the anticipated, common, really important events is to have large trials with carefully defined outcomes that we reliably determine and we had this meeting with Bob Temple and David DeMets and others a couple of weeks ago looking over the experience of the diabetes cardiovascular outcome trials which I think is a really nice example We’re not gonna be able to understand the effects on these critically important common events unless we have large enough long enough trials to assess it And I think the diabetes experience is one that shows how valuable that can be to really understand not only do we have safe drugs but do we have drugs where, sometimes unexpectedly actually, we have beneficial effects on some of these cardiovascular outcomes And then just three other brief comments that I think speak to some of the experiences that we’ve had around these issues One is this issue of structured versus unstructured data collection Active versus passive and I think this is critically important If we really wanna know the effects, for example, on myocardial infarction we have to collect not in an unstructured, passive AE approach but we have to ask did the event occur or not systematically according to a definition? And we need to publish more of this information but, for example, we’ve had experience where we have redundant SAE reporting and structured reporting and we, for example, had half the number of myocardial infarctions identified through SAE reporting as we have when we ask did or did not these events occur?
So I think we’re fooling ourselves if we think through passive AE reporting that we’re collecting accurate information on some of these common events Another example I have is a DSMB that I chaired where we had a drug that was anticipated to potentially have liver safety issues and when the liver safety was collected in a highly structured way systematically collecting information like liver function tests and looking at it with a drug induced liver injury expert panel we saw clear and compelling evidence of liver injury which, when we looked at the SAE reporting, we never would have detected So, again, it’s an example where the data is gonna be only as good as the kind of structured approach to collecting it and I think that’s really important and it alarms me, and I’ve seen this and I hope it’s not happening anymore, when data is collected for outcome events that are important through a passive approach of AE reporting, identifying those potential events, and then collecting them, and thinking that’s a reliable way to collect events – Okay But can I push a little bit more on this? So in a trial how would you, you know, recognizing we don’t want all the events individually how would you go about it as you’re starting a trial to think about what to do about these anticipated events in the population and how to know whether to report them?
– So I think the first thing is to collect them in a structured way and, of course, this depends on the drug, and the population, and the size of the trial but any trial in a population that has common cardiovascular outcomes because that is the most common cause of death and disability that people experience, that older populations experience, those should be collected, I think, in a structured way Now if they’re not and sometimes they won’t be if, in an intermediate population let’s say, then I think we still wanna have some type of adjudication of the events about the reliability of how they fulfill kind of standardized definitions for these events as one of the important approaches – Okay Mary do you wanna comment from your perspective? – Sure I get that So I am the Safety Deputy in our division of Cardiovascular and Renal Products so I’m gonna describe a little bit about what it’s like from an FDA medical reviewer when they get these IND reports And I took away some great
takeaways from Dr. Wittes’s presentation because it really kind of highlights where FDA might fill in the holes with these because we have sort of access to lots of information that individual sponsors do not But just for a reality check the way our current system is set up it’s not even really easy for that sort of information to be gleaned across our IT systems right now We are starting some initiatives where we are gonna be able to look at things in aggregate across therapeutic areas, across indications and that, I think, might help fill in the hole of that sort of absence area that has been identified that individual sponsors don’t have access to To deluge medical officers with reports that don’t mean anything increases the likelihood that they don’t look at anything so people will sometimes just almost throw up their hands and not be able to look at case by case by case and try and make some sort of safety conclusion based on what’s being sent in Medical officers have different review practices Some have lists of safety issues that have traveled over since pre-clinical, since IND opening and they will compare reports to that list They’ll update that report Some are less diligent about keeping a running list so you can see that sort of environment makes it difficult to look at case by case things where they can’t interpret whether it stands as an event or not I think one of our guiding principles today probably is and I think most of the reviewers at FDA probably agree that the sponsor’s in the best place to integrate safety knowledge They have the knowledge from pre-clinical to early phase to ongoing studies They are knee-deep in the program whereas an FDA reviewer will have lots of different programs and so I think we very much rely on the sponsors to do that curation of events to send to us so that we can sorta make sense of them from our point of view I’ve heard the theme of sharing data on background rates for certain things that people often come across in 
the post-marketing setting where you’re wanting to try to compare what’s happening in the treatment group to a control group I think we have a lot of interest in that as well It helps not in the pre-market setting It would help us in the post-marketing setting and I think we’ve thought about doing this a little bit with Sentinel ’cause that’s our sort of EHR claims tool that we have access to but I hear the pain about that I mean you’re only capturing certain events where patients are hospitalized It certainly does not apply to the gamut of things that you come across in a clinical trial so but I still think that it’s a very rich source of data that we should explore and figure out what can be used and shared I think that there’s an opportunity here to bring lots of people together who have these data I mean everybody’s doing their work in their corner and they probably have nice pockets of information If we could establish a library and refine those event rates as practice treatments, changes, medicine advances I think that would be really, really good initiative to get off the ground for everyone And along those same lines I think it would be important to, and I don’t know if this is the case or not, whether some work has been done to compare how that worked for you You identified this historical rate, or anticipated rate, of MI in this population how did that ultimately end up comparing to what you saw in the trial? 
And do that for multiple data sources I think that would be very informative and help maybe refine how you approach identifying anticipated events – Barbara you mentioned that, in your pilot, you sort of looked across a number of different sources and as you were doing that can you sort of speak to was there such a variability that it became very difficult to even get to a number that you might a rate that you might think was reasonable and also sort of, in terms of the anticipated events, the cardiovascular or outside the cardiovascular area is there places where it was easier and more reliable? Or, you know, in the oncology area ’cause we have SEER reporting and everything is it more? Can you speak to that at all? – Yeah I work mainly in immunology What can I say about that is that we did try to go first to more clinical trial data sources and the placebo rates often can be highly variable because the placebo arm is often has fewer patients (chuckling) than the treatment arms and so the rates can be quite variable ’cause the number of events may be not so large So that can be a bit of a challenge So in looking at the events also at the active treatment arms especially if there’s been no reason to believe that that active treatment is associated with that event we look at that as well
to try to get an idea of the range in the patient population Also registry studies have been where we’ve gone a lot to look at you know So we have reports from various registries we have access to and so I think that’s been a great source of information for us is the registry data as well so – Can I just comment again about this? I’m a little bit surprised because there’s lots of data that, for example, simply the act of getting consent from a patient, which is a requirement to be in a clinical trial, creates a highly selected population and I think, on average, once you get consent that patient has half the rate of some of these major adverse outcomes of an unselected population I think we’re totally delusional if we think that we can use an unselected population as a reference rate for patients in a clinical trial and I think we have lots of data to support that So it may be relevant to know what that background rate is but we should be very careful in suggesting that that’s an appropriate rate to compare adverse events in a clinical trial – Well I think what we try to do is look at a variety of data sources Clinical trial data is our first preference, if we can, because, I agree with you, it’s the most aligned with what we’re currently doing looking at the rates in another clinical trial But I think what we do is look across various data sources and see how close are the rates Actually, you know, a lotta times the registry rates and the clinical trial rates aren’t so far off from each other in our experience for some of the things we’ve looked at They’ve been pretty close sometimes so if you see consistency across multiple data sources then that gives you more confidence You know that you’re probably on the right track but if you’re seeing big discrepancies I agree with you Then you have to think about like why is that, you know?
And some of the problem is with the clinical trial data sometimes is that the number of events are so small that there’s a lot of variability You need to get a much bigger database of studies pooled together I guess to really look at a rate that is more accurate So that’s some of the challenge there for us – It is a challenge and I would agree there but two comments One is that sometimes that clinical trial population is exactly very close to the clinical trial population you’re dealing with So it is a good reference rate for your particular trial to see if your rate is different The other thing is we’re talking about sort of signaling really within the earlier points of a clinical trial and I think, at that point, even if your reference rate is a little low it’s okay because you’d rather be on the conservative side and pick something up earlier than have a more general population rate where you may miss that signal within your clinical trial So an underestimate in the background rate might be acceptable in this particular case – One other thing We will get complete NDA packages where the trial’s done and you see these imbalances and you’re still at (chuckling) I’m not sure like why would this drug cause this? 
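A minimal sketch of the point made above about small event counts: with the same observed event proportion, the uncertainty around a placebo-arm rate shrinks as more patients are pooled. The function name `wilson_interval`, the 2% rate, and the arm sizes are all hypothetical illustrations, not figures from the workshop, and the Wilson score interval is just one common choice of interval.

```python
import math

def wilson_interval(events, n, z=1.96):
    """Approximate 95% Wilson score interval for an event proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical arms that all show the same observed 2% event proportion.
for events, n in [(2, 100), (20, 1000), (200, 10000)]:
    low, high = wilson_interval(events, n)
    print(f"{events}/{n}: interval {low:.4f} to {high:.4f}")
```

Under these assumed numbers, a single small placebo arm is compatible with a wide range of underlying rates, which is one way to read the suggestion that pooling several studies gives a more usable reference rate.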
So there’s a lot of uncertainty once those data are all put together So to make these kinds of decisions before that even happens I think you have to understand there’s gonna be a lotta fudging and judgment and (lips clicking) eyeballing it So until we get more precise background rates from whatever source we determine is the best one – I guess the other thing just before maybe we open it up to questions we had talked about, I think, Bob Temple said there’s two easy and two hard and the other hard is sort of the events that might have been identified or anticipated with your drug but whether the severity or frequency is elevated Is that also a challenge, in the sense of what we’ve been asking sponsors to do in this guidance, or is it really the anticipated that is our true challenge? – I think that’s a critical point and I go back, Salim, we were talking about in developing fibrinolytic therapy there was a clear and we knew the main trade off for the reduction in mortality related to the myocardial infarction itself was intracranial hemorrhage and it was an anticipated event But it was the critical event to be carefully monitoring to assure that the drugs had both safety and an appropriate trade off And what we tended to do was to have the DSMB very carefully
tracking that and even with pre-defined guidance on what might be appropriate trade offs but I think sometimes the anticipated events can be sometimes they’re the most important in terms of assuring ongoing, accurate assessment of safety and the trade off of safety and efficacy I mean maybe that’s an example where the DSMB is the key mechanism and we don’t need to involve the guidance that we’re talking about – Anyone else? We have a couple of minutes I don’t think we’re outta time And we’d be happy to take some questions – Maybe that microphone – Yeah I think there’s a mic I think they’re just about to turn it on for you – [Audience Member] I mean the discussions here show how difficult it is to estimate background rates but aren’t we creating a huge monstrosity for nothing? And what I mean by that is we have a superb methodology called randomized trials and we’re now going to the worst methodology called historical comparisons Aren’t we being foolish with this step? Shouldn’t we just step back and say really no matter how hard we try we can’t predict background rates? Let’s even take conditions in which certain outcomes are the primary endpoint like mortality in a cardiovascular trial Our prediction of event rates for which there’s a ton of data is not that great So how are we going to then take adverse events which haven’t been systematically documented in the literature and come up with realistic rates? 
And Chris made a very good point The people who you select in a trial aren’t your average registry or your average clinical thing I personally think this whole thing of unanticipated excess rates should be thrown out of the window completely In fact, down the toilet It should be flushed down (Jacqueline laughing) Instead let us focus on the strength of the randomized trial which is that randomized concurrent comparison and every drug will have side effects I remember my mentor, Richard Peto, telling me if a drug doesn’t have side effects it won’t have effects So if that is the case you have to balance side effects with effects every time I mean you can’t just put side effects alone and it goes back to the thrombolytic trials There was a clear excess in intracranial bleeds They emerged early but the reduction in mortality took longer to emerge when the GC trials If you only did this and looked at only safety you’d shut down those trials early and we’d miss the benefit So I have two thoughts for discussion Let’s get rid of this thing of anticipated, unanticipated higher rates of events That’s just a fool’s game and it’s just a create-work situation If you want to politically create work let’s do that for unemployed safety reporters (audience laughing) But the second part of it is I think looking at non randomized data loses the strength of what we’re doing and we should look at it in the context always of potential benefit I’ll leave that – So does anyone wanna make a comment? (panel laughing) (audience laughing) – Do you mean throw them out i.e. we don’t need to even consider reporting them in an expedited fashion? I’m just wondering how to– – Absolutely not 99% of (murmurs) – So you just keep ’em in your database and– – Let it be collected – Right but don’t make any decisions along the way – [Audience Member] Fewer with all the criteria, seriousness, all that kind of stuff but also compare it across the two groups– – Which is your best comparison?
– [Audience Member] And they weigh it against your benefit There are examples of (murmurs) (audience member murmurs) There was these higher rates of reports of cavernous thrombosis and the regulators, not the FDA to their credit, but some European regulators demanded we unblind the data The DSMB chair Peter Snipe was tough
and he said no and he managed to persuade them What that wasn’t telling us was we had a 50 percent reduction in serious bleeds including intracranial bleeds and nothing that we had anticipated was a 25 percent reduction at that time So if we had just reported cavernous thrombosis although the excess was 10 to one on a rare event, and it was real, we would have shut the trials down missing the substantial reduction in bleedings (audience member murmuring) Do you want me just to start all over again? (panel laughing) (audience laughing) – As you’re walking back I just wanna make one point though I mean one of the reasons that we’re talking about this is this was a method potentially not to unblind everything to make your comparison So there were sort of two approaches One you unblind and make that comparison between these events or you identify a rate upfront and only unblind when you exceed that rate So this was a way of trying to balance our need or our desire really to have more reporting that was more informative without jeopardizing what we recognize as the integrity of the trial So we need to put that in context It’s not that we don’t think randomization is great, we’re saying randomization is great, we just don’t know what to do with trying to unblind to get at this issue – [Audience Member] I think the solution you ended up with was the wrong solution unfortunately I understand the intent and what I think you should do is the unblinding should be done not by the sponsor, not by the investigator, but it should be some form of regular safety reporting and ideally by the data monitoring committee because they’ve seen the totality of the evidence It could be that one adverse event is going up but a whole lot of things are going down and they’re putting it in the context of the efficacy Now I think you’re losing the strength of randomization You’re losing the totality of the evidence which is the context with the current method
and it’s a huge work creation effort that has now led to this huge machinery and it’s not just the FDA that is drowning in paper The IRBs at each center are drowning in paper and our IRB and all the IRBs in Ontario, in Canada, have said we don’t want to review them But if you want us to review it it’s $300 per case that you have to review It is a way of pushing back and saying this is nonsense and silly We’re not going to do silly stuff and the other part of it is in the study that we are coordinating around the world called COMPASS we have to send monitors to make sure that the investigator sent it to the IRB and there is documentation and there is concern that if we don’t document when the auditors from FDA come we will be cited for lack of oversight Now this whole rigmarole is useless and instead let’s just do decent, big trials and get a good, sensible group of people to review it That’s the easiest way to do it But yet– – I think there is a part of that in what is being described here and that is the use of the safety assessment committee that can unblind data, that doesn’t make independent decisions about shutting trials down but they work with, I think your slide showed this, they work with the DMC and so that they’re not just saying oh my gosh we see cavernous thrombosis shut it down That’s not the right decision to make The right decision is one made by a group of people that can look at both sides, look at benefit and risk, and say do we need to reconsent?
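As a rough illustration of the second approach described in this exchange (set an expected rate up front and look at unblinded data only when the pooled, blinded count exceeds it), the sketch below computes an exact binomial upper-tail probability. Everything in it is an assumption for illustration: the function names, the 3% expected rate, the 400 pooled patients, and the 0.05 cut-off are hypothetical, and the draft guidance does not prescribe this particular test. The result also depends entirely on the expected rate, which is the quantity the panel disagreed about.

```python
from math import comb

def upper_tail_prob(events, n, expected_rate):
    """P(X >= events) for X ~ Binomial(n, expected_rate)."""
    lower = sum(
        comb(n, k) * expected_rate**k * (1 - expected_rate) ** (n - k)
        for k in range(events)
    )
    return 1.0 - lower

def should_refer_for_unblinded_review(events, n, expected_rate, alpha=0.05):
    """Flag a pooled, blinded count that is unlikely under the expected rate.

    A flag here only means 'hand it to the committee that may unblind';
    it is not a decision to report or to stop a trial.
    """
    return upper_tail_prob(events, n, expected_rate) < alpha

# Hypothetical program: 400 blinded patients, 3% expected background rate.
for observed in (14, 22):
    flagged = should_refer_for_unblinded_review(observed, 400, 0.03)
    print(f"{observed} events among 400 patients: refer = {flagged}")
```

Because the count is pooled across arms, a flag says nothing about which arm the excess sits in; that comparison is left to the unblinded, randomized data the committee would then review.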
I mean that’s not a great thing to have to do but do you need to change monitoring, collect more information on the events but I think I understand your discomfort with using historical rates because they’re really all over the place and I agree with using the randomized data if you can and I think that model is built in to some of the things that have been proposed here – [Audience Member] I think the whole point of the changes was to do exactly what you want What we were getting was, just as an example, every heart attack would come in and it would go to all the IRBs, and go to all the investigators, and in a population over 65 that’s not very helpful until you break it down and look and see if the rate is higher So what it calls for is not reporting those until and unless there’s some reason to think the rate is higher and, as was discussed, that could be because the overall rate’s higher than you expected but that doesn’t mean you should then report it That means you should then unblind it by the data monitoring committee or a safety assessment committee and take a look The trouble with, I’m not sure I
understood what you wanted to do, but this is something you do during the study It’s not at the end of the study At the end of the study you look at all that stuff It’s trying to anticipate that the drug is doing something bad That’s sort of a responsibility you have to the patients So you need to get a smell early when it looks unbalanced and how to do that that’s one of the main things we’re talking about How do you know when you’ve passed the threshold or when you should look? I guess one possibility is to take almost any serious adverse effect and have a data monitoring committee look at the rate of all those unblinded and that’s one perfectly plausible solution And it’s not incompatible with what we suggest but that’s a lot of work and we also, by the way I just wanna mention it, it’s not necessarily only one study we want people to look at We wanna look at the whole database So if there’s five studies in depression going on we want them to look at all of it and how to do that is the problem But not sending ’em in before you do that is what we trying to (chuckling) that’s what we were trying to accomplish The very thing you’re worried about so– – Thank you – [Robert] Thank you Robert Baker, from Lilly, I have, I hope, a shorter question (audience laughing) (panel laughing) which is I’d be interested in the panel’s view at a bit higher level Are we aligned on exactly what we’re trying to accomplish because I think that can help us to narrow or focus the tactics and specifically I asserted when I spoke that this is about subject protection within our trials, within our programs whilst maintaining the integrity of the blind and I think if that is indeed what we agree on we can have tactics that might be more manageable but I could also envisage that the agency might be interested in maybe lower severity things and more contemporaneous as you’re looking at other programs so I think it’d be useful to know whether really if we get this right for the kind of serious 
things that would really shift the benefit risk for our subjects that we’ve done enough – Well I’ll defer to Mary but certainly this is all about serious adverse events and sort of looking and getting information that is relevant to the safety of the patients earlier if we can And when it comes in as sort of a single event and that analysis is not done and it’s not done across the development it’s hard for us to get that signal early Mary if you wanna– – I mean of course the emerging principle here is to protect patient safety but, in order to do that efficiently, you have to put some thought into what you think is important I mean I don’t think anybody is interested in shutting every trial down for every SAE and a lotta time you don’t need to do anything It’s something that you might have thought of from the pre-clinical model but just sending everything in without filtering it down and trying to come up with a list of these things is not a good way to protect patient safety ’cause it’s just paperwork – [Robert] Yeah just to be clear maybe more clear on the question sorry It’s that we will, of course, when the trial ends we’ll have data, we’ll have the randomization You’ll see that I can imagine tactics that we could use in unblinding groups or even the DMCs that’ll keep pursuing this at a subject level but I guess how high is the priority that you’re seeing that extremely quickly versus that we’re protecting the subjects? Because if it’s the former then I think that we’re gonna have to cast a wider net – I’m sorry is the former that we’re seeing it very quickly or that we’re protecting subjects? 
– You – I guess it’s hard for us to divorce those two because I think that’s what the safety reporting is about It’s just that it’s both of us working together towards that same goal and trying to get there So we’re not asking for these early just to see them early We’re asking them early ’cause we wanna make sure that we’re doing the right thing for patients and for the trial And I think, in many ways, the thought was we’re not asking for them early with this guidance and maybe that’s not clear because we’re not asking you to report them as they’re happening if they’re anticipated We’re asking you to look at them and keep monitoring them and report them when now we are seeing a difference that we need to pay attention to because it could be a patient safety issue So I think that’s the goals we were trying to get at Maybe we have not done that and that’s sort of why we’re in this room today to figure out how to get it optimized – And I’ll just reiterate I think it is really important that there’s somewhat of a shift to the expedited reporting issue to, in these anticipated events, the topic of this– – Yeah – Panel that there’s more of a focus on that On the common serious consequences that our patients are experiencing
or events that our patients are experiencing and making sure that signals aren’t arising that there’s a substantial difference in those that we should be identifying for patient safety – Yeah I think one of the questions I’ve had is exactly what was brought up is for expedited reports that when you’re looking at these events that may not rise to the level of stopping the trial or even a protocol amendment with different exclusion criteria or a different safety monitoring Where is that threshold for just informing investigators there is a possible causal association with this product that they need to be aware of and take into account when they’re dealing with this individual patient before them? And with their constellation of medical circumstances versus rising to the level that you think you’ve gotten to the point that you need to modify the trial which we know, you know, at that point you really do need to look at the benefit versus risk sometimes that the data monitoring committee is a better position to do – I see what you’re saying This sort of threshold issue in terms of– – Yeah – How fast and when you bring it in and what we’re recognizing I know we’re a little over time I’m gonna let two short questions maybe? (Jacqueline laughing) – I’ll keep it short, right Bill? 
(Jacqueline laughing) I think we’re a lot more aligned than the discussion sounds here but we have to remind ourselves we’re talking about anticipated events primarily so events that where you need an aggregate analysis You have some information here At Merck we try to remind ourselves periodically that we’re looking for the wolves not the chipmunks So, as the reviewer pointed out, even at the end of the trial or the end of the program where you have all of the data clean, locked, integrated together it’s still hard to find the chipmunks So we’re looking for the wolves here and for the discussion that was brought up before about registries maybe not being useful or some other data source maybe not being relevant it depends Sometimes it may be useful I think you should gather all of the available information taking into consideration the relevance to the ongoing trial So for cardiovascular endpoints that have to be adjudicated maybe it’s not useful to look at a registry but there’s a lot of endpoints where that’ll give us a more robust understanding And we shouldn’t overpromise to anybody here what we’re trying to do, right? We’re trying to find these wolves again in ongoing trials So if we don’t have relevant information about the background rate then we can’t take advantage of some of these more quantitative methods but if we don’t have a lot of understanding then it should be a little bit more difficult to cross some threshold anyway So it’s not really a question but I just wanted to emphasize we have to keep reminding ourselves what we’re really trying to do here We’re not trying to find a difference in a very rare event That’s the type A and B events The single or small clusters – And final comment – Hi, Asif Hark from AstraZeneca So my question or thought for an alternative approach Where would the sponsors come to the FDA for guidance on a suggested plan for investigation study? 
Then also discussions on what events should be reported or what events should not be reported because they might be endpoints? Now I think we can define that further because the common question that everybody’s talking about is incidence rates So perhaps there could be a discussion or some guidance from the FDA at that point that these are particular events and these are the event rates that we have seen and perhaps if the FDA can maybe just add one more (chuckling) thing for them to do is to figure out what a line on a better incidence rate based on their global knowledge and then provide that to the company saying this is your threshold This is what we want you to do because currently we’re all searching for the same question Where’s the threshold? Who defines it and which data source do we look at? ‘Cause every data source we look at is gonna be different and everyone looks at it from a different angle And again when you look at a patient population you’re talking severity, grading, cardiac failure, et cetera, et cetera So everyone would have a different set of brackets to do So is that something that can be thought about when a sponsor comes to you at the design phase that you with your background knowledge and all the background information that you have set that standard at that stage? Because then that will really help us also to work from a standard rather than build our own standard which will be different for everyone

– I can tell you that we do not have the magic background incidence rates. (chuckling) But I mean we do have access to different things that our people have access to and so I think we would be certainly willing to work with you when you came in and said you know these five events we think are gonna happen in our patient population based on x, y, z Using these data sources our range of expected incidence is bla, bla, bla I think we would welcome working with you about sort of looking at how you came up with those rates and agreeing or disagreeing or– – [Audience Member] And also publish them so everyone has access to– – Well that’s what I mean A library of background rates would be lovely I mean I think these are– – Because we all struggle with that whenever we have to look at an aggregate report Which database do we look at? Do we look at your Solaris? Do we look at this? Do we look at that? Thank you – Thank you We’re a couple minutes late but we’re on break now until sorry one second? (Jacqueline laughing) – One, one second I think those discussions are going the wrong way We’re going to create a library We’re going to give you a library and we know it’s terribly unreliable Get rid of this idea and stick to the randomized data The request was not sensible I’m not sure the response is going to be useful – Okay So we have both views on the table here (panel laughing) Well thank you everyone, panelists and presenters, for a great discussion – Great thanks Jacqueline and all of the panelists (audience applauding) We’re gonna go ahead and take a 15 minute break or actually 10 minute break We’ll start back with the next session at 11 o’clock Thanks (crowd murmurs) (crowd murmurs) Okay we’re gonna go ahead

and get started

Can I ask you all to please make your way to your seats? (crowd murmurs) Welcome back to our second session focused on maintaining trial integrity with unblinding and clarifying the flexibility in choosing the entity responsible for aggregate analysis in IND safety reporting We’ve heard quite a bit of discussion already so far about the responsibilities of these various groups and how to do this I’d like to now introduce our moderator for this session Yeah – We’re not in a– – I know that. (chuckling) (crowd murmuring) (papers rustling) (crowd murmuring) – Okey dokey – Okay I’ll go ahead and introduce our moderator for this session, Professor and Associate Chair and Director of the Collaborative Studies Coordinating Center at the Department of Biostatistics at University of North Carolina, Lisa LaVange Lisa can you join me up here? And then I can ask the panelists to go ahead and make your way up on the stage as well Thank you – Thank you Greg I’d like to introduce our two speakers and panelists We have joining us on stage Ajay Singh Ajay is the Team Leader in the Safety Evaluation and Risk Management group at GlaxoSmithKline Our second presenter is Tom Fleming Tom is Professor and former chair of the Department of Biostatistics at the University of Washington in Seattle And we have five reactants Four in person and one hopefully joining on the phone We have Amit Bhattacharyya, Vice President of Biometrics at ACI Clinical We have, on the phone, if she can join us, Erin Bohula, Associate Physician in Cardiovascular Medicine at Brigham and Women’s Hospital, an instructor of medicine at Harvard Medical School, and an investigator in the TIMI Study Group We have Elissa Malkin, Medical Director for Pharmacovigilance at BioMarin Pharmaceuticals and we have Dave DeMets, Emeritus Max Halperin Professor of Biostatistics and former chair of Biostatistics and Medical Informatics at the University of Wisconsin at Madison We have, finally, Mat Soukup from the FDA Mat is 
Deputy Division Director for the Biometrics division that oversees pre-market and post-market safety evaluations in the Office For BioStatistics Center For Drug Evaluation and Research So we have a really wonderful distinguished panel We have a plethora of statisticians and we’re going to get into more of the nitty gritty of how we can figure out this more efficient and effective IND safety reporting This session has primarily two objectives We want to look first at the challenges related to unblinding data to determine if there’s an imbalance in these events between treatment groups as we’ve discussed in the first session And also whether that could be maintained by a group other than the DMC Whether the trial integrity could be maintained in spite of this unblinding and then we wanna talk a little bit about the safety assessment committee How it differs from a data monitoring committee How those two might work together if they are both maintained or in a particular trial So Ajay is gonna start off with a presentation

from the industry’s perspective for us – Hi Lisa this is Erin Bohula I just wanted to confirm that I’m on – [Lisa] Great, thank you – Thank you Dr. LaVange appreciate it So I’m Ajay Singh from GSK and Dr. Wittes I’m going to have to report your AE to our vaccines office so (chuckling) hopefully it’s not serious but we’ll touch base later on but So again also my appreciation to Duke Margolis and FDA for really allowing I think an open discussion about the extant and proposed safety oversight guidelines in the context of the INDSR reporting structure I think it’s fair to say that all of us whether we’re in academics, the regulatory bodies, at this committee, or industry sponsors are facing mounting challenges for the safety oversight And I was sort of thinking about this and maybe we should celebrate these challenges because now with the advent of targeted therapy and immunotherapy for cancer and individualized therapy for rare diseases we can really study patients we couldn’t study 10 years ago, 15 years ago and make a difference in their lives But these patients, as you all know, come with significant comorbidities They are on a litany of medicines which we would have excluded before and many will have underlying end-organ compromise which makes the evaluation and assessment of causality of treatment-emergent adverse events much more complex especially if you look at some of the ones listed This is a very small group obviously Ones like the enzyme abnormalities or renal function problems which have a multifactorial etiology Now the last one is close to my heart I am a nephrologist and it’s one of the hypotheticals that was presented in the guidance document So, if I could, just ask for your forbearance here and just present a somewhat extended version of this hypothetical that was in the guidance document A not unusual scenario we may face, all of us I think, would be the cluster of five events of acute renal injury or acute renal failure among 200 patients in a cancer trial 
Now hypothetically what I’ve said is that in a non clinical evaluation at high doses this drug was associated with evidence of tubular injury Each of those five cases came in unrelated because there was another risk factor Dehydration, diarrhea, or previous exposure to nephrotoxic chemotherapy Not uncommon in this population Therefore when we do the pharmacovigilance assessment what we would likely say is that the cases had other risk factors and in the vernacular of the pharmacovigilance world we’d probably say that they were confounded with apologies to my epidemiology friends here But I think the caveat here is that it may be precisely these vulnerable patients who present first, who present at the lowest exposure, at the lowest dose, if there was an effect on GFR, i.e. an elderly dehydrated patient may be much more likely to manifest the clinical complication of tubular toxicity than a 25 year old that’s well hydrated Secondly even though each case was deemed unrelated, I think all of us would have done that, we as a sponsor and at the regulatory bodies have sight of the cluster of events And thirdly we had a very, very nice discussion that even though, if I review the literature, renal compromise is very common in the cancer population and historical controls serve, I think, as a good barometer, really the best comparison would be the drug versus contemporaneous control I think we all realize that And I must confess when we have these types of discussions which you do all the time at sponsor level and I’m sure at the regulatory level I find myself muttering under my breath God I wish I knew the drug allocation just to be reminded by my more sagacious statistical colleagues and clinical colleagues that we need to maintain the trial integrity And really that’s the question for today in terms of what are the challenges of maintaining trial integrity with now routine unblinding of safety data? 
Or more specifically the SAC concept: to have a committee that can review unblinded SAEs across the development portfolio within the context of the INDSR evaluation process Now this slide we’ve talked about before Not to belabor it but we’ve made a distinction here Events which can be reviewed as isolated cases versus those that clearly require aggregated review

and amongst the latter I’ve just put it into the categories that have been discussed Emerging new signals That is previously unexpected events or perhaps the more challenging concept of higher than expected rates of already identified events that were suspected or anticipated events And because this cohort, group C as we are calling it, requires comparison to either historic or contemporaneous controls the proposal is for a routine unblinded review by the oversight committee the so-called SAC Which gets me to really the first challenge If we constitute SACs across the developmental portfolio to look at all serious adverse events that will, for many companies, require a lot of data being unblinded And a lotta colleagues that will have at least sight of the unblinded data whether it be at the aggregate level or at the patient level From the data managers to statisticians and the oversight committees And this does reflect truly a paradigm shift in how we do in-stream safety analysis Not to say there’s anything wrong with it but it does suggest for us a big change in how we would do our in-stream analysis Now one can imagine that these colleagues on the oversight committee will likely be looking at a lotta signals sometimes with small numbers and sometimes with incomplete data And, hence, on occasion and it’s possible even more so not on occasion they may require supplemental information So let me just go back, if I could, to these hypothetical cases of the five reports of acute renal failure clearly an event that we’re all worried about And let’s say there is an imbalance of five versus zero, four versus one Whatever we think is a bar that we think is at least meaningful After data cutoff and after formal unblinding the study team would have the benefit of contextualizing this SAE imbalance within the broader framework of trends in AEs that are relevant as well as trends in labs Creatinine trends whether they be summary data or whether they be shift tables So 
depending on the clinical importance of the emerging signal it may be possible that the SAC may request further unblinding and I think it’s possible that unblinding may extend beyond just SAEs to evaluate an emerging signal Now I don’t think anybody in this room or any of us would argue that the wellbeing of the patient supersedes any other consideration I think that’s what we’re all here for but I think we would all agree and the last conversation I think was very helpful is that these considerations are often highly nuanced, right? And the decision to unblind beyond SAEs for a committee can be highly challenging ’cause they’re likely to be looking at multiple imbalances, small numbers, many of them may be clinically important, and they may have incomplete data So I think this is another one of the challenges that SACs will likely face as they look at interim data with incomplete information of potentially really important events where we don’t have a lot of the information and we’re only looking at SAEs I think Dr. Granger pointed out one case where it was the adverse events in which the liver signal came out I think it could happen not infrequently So taking a step back Irrespective of the amount of data we unblind if we have appropriate barriers in place trial integrity should not be compromised So why not go with the IDMC model where everybody is external? 
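As a rough illustration of why small imbalances such as five versus zero or four versus one are hard to interpret on their own, here is a minimal sketch. It assumes the 200 patients were randomized 1:1 (100 per arm, which the talk does not state) and uses a two-sided Fisher exact test, which is a choice made for this illustration rather than a method named by the speaker.

```python
from math import comb


def fisher_two_sided(events_drug, events_ctrl, n_drug, n_ctrl):
    """Two-sided Fisher exact p-value for a 2x2 table of events by arm.

    Sums the hypergeometric probabilities of every split of the total
    events that is no more likely than the observed split.
    """
    total_events = events_drug + events_ctrl
    denom = comb(n_drug + n_ctrl, total_events)

    def prob(k):
        # Probability that exactly k of the total events fall in the drug arm
        return comb(n_drug, k) * comb(n_ctrl, total_events - k) / denom

    lowest = max(0, total_events - n_ctrl)
    highest = min(total_events, n_drug)
    observed = prob(events_drug)
    # Small tolerance guards against floating-point ties
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )


# Hypothetical 1:1 split of the 200 patients (assumption for illustration)
p_5_vs_0 = fisher_two_sided(5, 0, 100, 100)
p_4_vs_1 = fisher_two_sided(4, 1, 100, 100)
print(f"5 vs 0: p = {p_5_vs_0:.3f}")
print(f"4 vs 1: p = {p_4_vs_1:.3f}")
```

Under these assumed arm sizes neither split is overwhelming evidence by itself, which fits the point that a committee looking at small numbers would likely want supporting context such as lab trends before drawing conclusions.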
That’s possible and certainly there is precedent for companies having independent, parallel, unblinded review in certain situations by a separate group However, the caveat here is that, at least at GSK, we have very small numbers there doing that and therefore if you have small numbers of colleagues that are sequestered it is fairly easy to maintain significant barriers The practical logistics will get much more complex as the number of SACs and the attendant unblinding grows So I think, just to summarize, the effectiveness of the barriers would depend on the size of the company, on the structure of the barriers but also on the constitution of the committees

with external committees being the most robust in terms of barriers but I think there are some real considerations, and these were in the feedback to the docket, of why internal members may actually be beneficial also These committees will be looking at a lot of signals and will have to interpret them in the context of the population but also the full non clinical data which may help to assess biological plausibility And if data pooling is required they will need a very great in depth understanding of the design and the results of all the previous studies So in many instances internal colleagues may be well suited to sit on these committees and secondly we are recommending that the SAC provide a recommendation on INDSR submission which is a major regulatory role with attendant effects on protocol as well as ICS And sponsors may not feel it appropriate to completely relinquish these responsibilities Skipping over this I know I’m running close on time but one other thing I would like to just point out Reading the response to the docket a recurrent and common request, I think would be the best way to put it, from sponsors was a request to perhaps maintain some flexibility on the need for an SAC and the remit of an SAC based on the population studied, the stage of development, and any prior safety signals And I think this is in recognition of the fact that most companies already have robust processes So I’ll take the last 30 seconds and just describe our process at GSK and, please, this is in no way to say this is an ideal way or ideal process It is one paradigm So the core of the safety construct really are the multi-disciplinary safety review teams which operate with guidance from and oversight by several other committees all under the governance of our Chief Medical Officer, CSO, and global safety boards The safety panels are advisory In the spirit of the SAC we have three possible oversight committees which can do parallel unblinded review if required However, these 
are not mandated with the exception of IDMCs which are almost always required for pivotal registration studies The other two, ISRCs and DRCs, are constituted almost always for early phase products and with advice from the governance body and R&D leadership and they can do parallel unblinded review However only a minority of our projects that do not have IDMCs operate under an oversight committee Regardless the responsibility for INDSRs does stay with the core safety team and they are encouraged to seek guidance from other panels And one of the other panels that we have, our formal safety panels, of which we have a lot I’ve just listed a few of them These are formal panels that generally have internal colleagues but some external colleagues that are basically available to review emerging signals and help out the safety colleagues on an ongoing basis So just one possible paradigm where there is a flexible approach and with that I think I’ll turn back to Dr. LaVange I wanna thank everybody and really my genuine gratitude to the FDA and to Duke for facilitating this conversation and to our esteemed panel for providing further context so thank you – Thank you VJ and now Tom if you could provide us with your presentation Ajay did I just call you Ajay? 
– It is fine, yes – Sorry – Thank you Lisa So, as Ajay and Lisa have already pointed out there’s really a dual challenge here and one is to get the timely and reliable IND safety reporting but to do so in a way that protects the integrity of ongoing trials I’m grateful to insights from many folks Discussions I’ve had with Lisa, and Dave Demets, and Janet Wittes, and many others on this issue and, in particular, I think Janet’s article on the FDA final rule was very enlightening to me As that article emphasizes, as is in the guidance, there are really three simultaneous characteristics that we’re looking for for these key events That they’re serious, that they’re unexpected, and then the art of this is there’s evidence to suggest a causal relationship That they’re not just adverse events That they’re suspected or certain adverse reactions And so I found it very useful that the FDA guidance has these categories A, B, and C and Janet’s article talking about the kind of evidence that we would use for each of these to understand the drug’s causal relationship Categories A and B

If you have a progressive, multifocal leukoencephalopathy you only need a single event Probably don’t even have to unblind it If you have an MI in a young patient you probably only need a few occurrences To me this isn’t controversial in this setting in terms of process Yes we would have medical monitors Yes they would be monitoring continually Yes they would unblind as necessary to identify those cases The challenging issue, as others have said here, is category C These common events in an exposed population, MACE events, in populations with moderate to high cardiovascular risk There’s no question, as others have said, that to understand causality here you need to be able to compare the frequency in a drug group against a comparator group And as both Chris and Salim have pointed out that really needs to be a randomized trial and it needs to be of sufficient size This could be done by a data monitoring committee It could be done by a safety assessment committee FDA talks about these safety assessment committees as achieving timely identification reporting of CSRs So I’m gonna talk about some important operating procedures of that SAC It’s gonna be certainly important that it has insights from unblinded data from both completed, and here’s the critical issue, ongoing trials And it’s important to recognize that in ongoing trials we already have multiple oversight bodies that have overlapping ethical and scientific responsibilities In a clinical trial sponsors and investigators have key decision making responsibility for designing the conduct of the trial Caregivers for the oversight of their patients IRBs and regulatory authorities for approval of the ethics and the science and the ongoing monitoring of SUSARs Data monitoring committees have a partnership that’s very much partnering up with all of these other oversight bodies but a distinguishing characteristic we’ve always said about DMCs is they have sole access to the emerging data in an unblinded manner Why is 
that? As we look at the mission of the data monitoring committee always I think of it as first and foremost safeguarding the interests of participants but also enhancing trial integrity and credibility I call it individual ethics and collective ethics To achieve this mission there are certain procedures that are key One is maintaining confidentiality to reduce pre-judgment Another is to obtain unbiased judgment through having a well informed and independent committee and I think this motivates fundamental principles then for our data monitoring committees For our independent and unbiased judgment they should be multidisciplinary They should be independent For reducing risks of breaches of confidentiality they should have sole access To provide some additional motivation for this sole access there are numerous examples from our experience One recent example, to point this out, is the LIGHT trial which was looking at Naltrexone and Bupropion recently called Contrave and it was an obese, overweight subjects with cardiovascular risk factors The objective of this was a cardiovascular safety trial to see if we could rule out an excess of a 40 percent increase in MACE Cardiovascular death, stroke, and MI But the trial was also designed to allow a quarter of the way into the trial, after 90 events, the ability to see if we could rule out a larger margin If so that would provide the ability for earlier marketing approval Around the time this study was started FDA had a Part 15 open public hearing on August 11th, 2014 to talk about how we proceed in such studies and there was strong consensus to the importance of proceeding in this fashion There was also consensus among academia, industry, and regulatory authorities that if you, in fact, were releasing the data at this early point for regulatory approval that it was important to have proper firewalls That you were releasing it to those members of the sponsor who had regulatory reporting responsibility They were firewalled 
away from the study team They were firewalled away from the rest of the sponsor and when the FDA did their summary basis for approval they would be very sure to simply point out that the data had ruled out the larger margin but to keep, in essence, equipoise for the trial to continue So, in essence, what happened in this study? That first analysis was done in November of 2013 The data, in fact, provided considerable evidence that Contrave was ruling out a doubling The monitoring committee said release the data to FDA per the data access plan The study continued The FDA did, in fact, approve in 2014 It was 15 months later when we met again with a second quartile When the second quartile was available the data monitoring committee recognized that the second quartile was very inconsistent with the first quartile Instead of having 22 fewer deaths, strokes, MIs you had 17 in excess But the totality of the data were still consistent with ruling out a 1.4 margin The DMC said continue the trial

but we said thank goodness we’re following a process where these interim data are not widely disseminated Well the day after this, in fact it was later that day, the data monitoring committee, and the steering committee, and probably the FDA were shocked when we realized that the sponsor had actually released the first quartile in a patent filing The steering committee realized a few days later that this breach had such significant impact on the integrity of the trial that they recommended trial termination The data were then reported a year later in JAMA in 2016 There were 64 percent of the total planned events in hand There were key insights here and the first is the unreliability of interim data It’s very apparent that these interim data were very inconsistent with what the final or more complete data showed And the second is that breaches in confidentiality provide potential for dissemination of misleading results and risks for irreversible compromise in the integrity of trials These interim data certainly were very misleading and their release did, in fact, compromise the ability to successfully complete the trial There has been considerable consensus The DAMOCLES group through an extensive discussion and research into DMC best practices concluded that there’s near unanimity that the interim data and the deliberations of the DMC should be absolutely confidential and breaches of confidentiality should be treated extremely seriously And there are formal statements of concordance with NIH, WHO, EMA and FDA However, sponsors will need to have access to interim data on a need to know basis So medical monitors certainly do have a need to know for their category A and B continuous monitoring For category C where you would have an SAC that, in fact, is going to have access to unblinded data that would include ongoing trials does that, in fact, need to be external to the sponsor? 
There’s a key principle and that is the sponsor’s insights from such access to unblinded data from ongoing trials should only be shared with DMCs, regulators, and others who also have that need to know to address individual and collective ethics And so as we’re looking at these important operating procedures this principle leads to another insight here That is those with access to unblinded aggregate data from ongoing clinical trials should be firewalled away from the study team Is that, in fact, gonna be effective? Can we effectively firewall? Especially in a small sponsor So it motivates limiting access to unblinded data, wide access, in ongoing trials to SAC members who are external to the sponsor which brings an interesting thought If collectively the SAC would have access to unblinded data from completed and ongoing trials could there or should there be a subgroup of that SAC that’s independent of the sponsor that would get the access to ongoing trial data that are unblinded? And, in fact, might the DMC be one option for actually carrying out that SAC function? Remember we had talked about principles that guide the integrity of the DMC process Aren’t these principles also relevant to the integrity of the SAC process? So, in particular, for the category C event monitoring being done should it be done by a DMC-like independent entity? Where, in essence, to have unbiased judgment as you’re assessing the potential causal relationship shouldn’t you have multidisciplinary, independent folks doing this? 
And it’s important to maintain confidentiality to avoid pre-judgment A DMC-like review process to carry out this event, this category C event monitoring, also has certain advantages A DMC sees everything so if the DMC is trying to understand whether or not stroke or MIs are SUSARs the DMC is going to see TIAs, angina, and other issues as well that are additively informative The DMC also sees the results by randomized grouping so they’re not solely dependent upon an investigator assessment of relatedness Many data monitoring committees that I’ve been on, in the placebo arm, I see related events The investigator didn’t know they were placebo or I’ll see 20 against two and none of them are called related because the investigator didn’t understand So by a DMC-like process you’re gonna increase the sensitivity and specificity of assessing SUSARs Well people say well wait a minute but a DMC-like process doesn’t meet in continual time Well I do wanna meet in continual time for category A, B But for category C

isn’t periodic monitoring adequate? Peter was already pointing out, correctly, that there are multiplicity issues that have to be taken into account Mary was pointing out how even at the end of a trial you often don’t have sufficient data We saw, from the LIGHT trial, how early data could be very misleading and so continuous monitoring isn’t probably necessary for category C The data monitoring committee doesn’t need to be but could be that entity that provides the category C monitoring Most DMCs monitor single registrational trials but I’m on three DMCs right now that are monitoring an overall clinical program that’s evaluating a single investigational agent We are looking at all of them This would be a particularly ideal setting for a DMC to be carrying out that part of the SAC function that’s looking at unblinded data So what are some issues that have to be addressed? Salim was talking about this already too What are the standards that we use on the data monitoring committee to take action? 
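The multiplicity point raised above can be made concrete with a small sketch. It assumes, purely for illustration, that every event type at every interim look is an independent comparison with a nominal 5% false-positive rate; real event types and repeated looks are correlated, so this gives intuition only and is not a calculation from the talk.

```python
def chance_of_false_signal(n_comparisons, alpha=0.05):
    """Probability of at least one false-positive imbalance across
    n_comparisons independent tests, each run at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


# Hypothetical numbers of event types and interim looks (assumptions)
for event_types, looks in [(1, 1), (10, 1), (10, 4), (50, 4)]:
    n = event_types * looks
    print(f"{event_types} event types x {looks} looks: "
          f"{chance_of_false_signal(n):.2f}")
```

The more event types and the more frequent the looks, the more likely it is that some imbalance appears by chance alone, which is one reason periodic rather than continuous review may be reasonable for category C events.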
We look at the entirety of the data in a benefit to risk context So if I was looking at a platelet inhibitor and MACE was the primary endpoint and we were seeing, on the DMC, an increase in hemorrhagic stroke or we were seeing fewer ischemic strokes and cardiovascular deaths and MIs we’re gonna continue that trial We’re not gonna report results out and so the DMC should proceed in that fashion If the DMC is serving in the dual role there may be a different set of thresholds that we would use for IND safety reporting and that’s an issue that will be discussed later this afternoon There will need to be proper communications between the SAC, the DMC, and the FDA and, in my closing slide, this leads to another important operating characteristic If the SAC is going to have access to unblinded data from ongoing trials it’s imperative that, I’m noting at the bottom, that there’s a charter that ensures that we’re going to not only obtain reliable and timely reporting of SUSARs but that we recognize the integral importance of maintaining confidentiality to protect the integrity of those ongoing studies – Thank you Tom Both the presentations were really excellent and set up our discussion very well So we have several reactants and we’ve talked, previously, about some discussion questions and we’re gonna, in the interest of time, weave those questions into the reactants but I wanna hear from everybody before we come back to the presenters So I’d like to start with Amit If you could give us a reaction to what the presenters have said And also, because of ACI Clinical working with different sponsors and setting up DMCs and so forth we’re particularly interested in hearing from you what existing frameworks you see for safeguarding safety already in place, whether there’s consistency across companies, and what it would mean to now setup these SACs or SRCs, and could similar procedures be used? 
And you’re mic’d so you can sit if you want – Yeah, yeah, fine – Okay – Thank you Lisa Thanks to the organizers for this very important meeting I think everybody’s waiting to get to the next step with this guidance, from draft to finalization in some form So, again, as Lisa mentioned I work for ACI Clinical which manages and supports safety committees whether it’s a DMC or it’s adjudication And of course when the guidance came out that was of utmost importance to us to see how we can use the current technology and the procedures that we have for the safety committees in the SAC area So, from that angle, we talked to several of our clients to see what their thinking process is and how we can work together And we talked about piloting some of them And so what I share here is sort of what we have heard from different clients of ours, small and big pharma and biotech And many of the discussions I think especially with respect to DMC actually are very aligned with what Dr. Fleming said in some sense because, and we will talk about that, I think the sponsors feel strongly about the independence of this safety monitoring as part of the SAC and, again, since morning we have been talking about the unintended consequence of study integrity being lost or in some way having less confidence in the study given all the several

times that we have gone through this unblinding and so on and so forth So I think having some firewall or even totally independent safety monitoring which is of a similar DMC type is very useful And the other thing is about safety This is about safety and a lot of times just looking at the DMC even their physicians do focus on safety but they do also focus on the benefit risk and so there might be a different set of physicians who might be more focused on safety but that’s something we can think about in future SACs The expectation is they’ll meet more frequently unlike DMCs which are probably a little less frequent Three months in general Sometimes it goes for six months So how can we incorporate that frequency into the SAC if we go to an independent DMC-like? Because, of course, no one wants to add more cost and resource Especially for small companies it’s a burden to institute another SAC in addition to DMC for example Background rate has been in discussion everywhere What is the background rate? And we have talked about it already many times today and we will be discussing more And especially in rare disease, oncology or older populations there’s so many studies and it is hard to see whether we should look at RWE as we heard about in the morning FAERS database I heard that so many times The FDA has the safety database Can we get some guidance from there? 
So I think there’s an expectation, to the best of my discussions, that maybe from FDA there would be a library of the safety events, the rates, maybe an interval rather than a point estimate, for all of us to start off, abide by, and say that’s the FDA’s expectation And when we submit an SAE or SUSAR to FDA that will be checked against them So maybe we should have some form of guidance from the regulatory agency about the potential safety event rates All the discussion about threshold, whether it’s the quantitative discussion or it’s the medical discussion, I think it’s both So I think the other things we heard from the different clients that we talked to and some of the concerns were, of course, degradation because if we have a program level SAC or DMC-like activity there could be a potential for unblinding some other ongoing trials and how can we institutionalize that when there are several trials ongoing and some of them are initially unblinded because we are looking into integrated safety information? And looking outside at the similar comparator drugs who decides what the non inferiority margin should be? So there are several of these typical questions being asked And then there is adjudication Some of the safety events could be adjudicated That might take time and, as we know that in DMCs, we sometimes come back at the next DMC meeting to talk about adjudicated events which were investigator reported in the previous DMC So how can we accommodate that type of issue in an SAC format? 
And so I think there are a lot of questions, a lot of interest from the different clients that we have, and we have discussed with some of them the implementation of some of these. But the question is, everybody’s waiting for the next version of the draft, because everyone realizes it’s not final yet. It will be changed in some form or the other. There might be a guidance. So I think those are the things. We have some implementation ideas about incorporating blinded review and then using a DMC with a different scope to do the SAC activities. And so that type of two-step approach, we thought, might also help in reducing or addressing the issue of the global alignment of safety reporting as well. So I will stop here and–
– Alright, thank you for those thoughts. When is that guidance coming out, Bob? Just kidding. (audience laughing) I can ask that because I was at FDA working on it. (panel laughing) (audience laughing) I’d like to go to Erin on the phone, if we’ve got her. And Erin, what I’d love for you to do is just give general reactions to their presentations, but also, in your role working on cardiovascular trials, if you could talk about what it might mean if you had a data monitoring committee and a safety advisory committee on the same study. What do you see might be some of the challenges in operating with both of those groups, and the communications, and so forth? If you don’t mind taking that question.
– Of course. Thank you so much for the invitation to participate, to the FDA and Duke. So first, apologies for not being there in person. I was pretty skeptical about whether the storm would amount to much, and it turns out we got more than a foot of snow, and I have no power at home, and most flights were canceled. So, let’s see, so as you’ve heard, I am an investigator with the TIMI Study Group at the Brigham and Women’s Hospital, and as an academic research organization we have fairly large cardiovascular outcome trials, mostly phase three and phase four. And so what I figured I could do is speak a little bit on our experience and maybe then build off of that on how there might be some room for alteration or improvement that’s gonna meet the FDA’s goals. So, in our trials, we universally have DMCs, and those DMCs have several different functions, as Dr. Fleming outlined. So, first, obviously they’re reviewing efficacy and, to Dr. Granger’s point, these are very targeted endpoints that have been sort of predetermined, are adjudicated, and the database is sort of screened and scoured for these events so that there is not under-reporting. And I think that’s a very important point, that unless you really target these specific endpoints there will be under-reporting, because we’ve had experiences in our trials where we call out certain events and we don’t call out other ones of interest. And you just see that, within the population, you’re seeing much lower rates than you might expect. But, obviously, the DMC is looking at those at periodic points throughout the trial in an unblinded fashion. But then, also, we call out safety events, and those are adverse events of special interest, which are targeted based on pre-clinical data, mechanism of action, theoretical concerns, prior clinical data, maybe post-marketing signals, depending on what level the trial is at. And, again, these are targeted and ideally, and hopefully, sort of well reported by investigators, and the DMC in our studies is looking at this at very regular intervals, depending on the study maybe every six months. And so that would sort of highlight the category C type events. The sponsor is looking for the rare events, the A and B events, and trying to unblind as necessary for those. But the DMC for our studies also has, Dr. Fleming may have mentioned this, a function where they get a listing of serious adverse events, unblinded, at these regular safety looks. I think that there are some challenges there. So it’s obviously trying to get to the bottom of: are there any category C events that are concerning, and is there an emerging signal that’s concerning?
But I think there are also some challenges with this, and I think the first thing is what was previously discussed about the terms and the coding, and the fact that you have this listing of interim data, not perfectly clean, where there may be overlapping terms that, individually, do not constitute a signal. There’s no imbalance, but then when you combine them, in fact appropriately combine them, there may be a signal. So it’s sort of being able to sort that out. I think some of the other pieces are in terms of having the DMC specifically be involved in IND safety reporting, which is not generally what we’re asking them to do. These are statisticians and clinicians, and they may not really have the expertise to sort of know what the recommendations and the guidance are for safety reporting. And we’ve also discussed the fact that there are different thresholds, potentially, for an IND report versus trial stoppage, and it’s important for whoever is reviewing these events to be able to appreciate that there should probably be different thresholds for these. And plus, the DMC is obviously not necessarily gonna have the knowledge of the other studies. May, but may not. The other studies that have gone before. And although they, of course, know the drug that’s being studied, they may not have detailed information about the sort of mechanism of action in the way that the sponsor would have, for being able to sort of parse out in this listing of AEs which ones are potentially really important and should be focused on. And then finally, again, in many of our studies these are cardiologists, and so they may not have the area of expertise, say, if it’s a pulmonary signal. So, as I was reflecting on those, I do think it’s probably within the purview of the DMC to be able to do this, because I do have concerns, like others do, about having large-scale unblinding that is under the umbrella of the sponsor, in terms of trial integrity. But I think that there are probably ways that we could do it more efficiently. So maybe, for example, as we have a multi-disciplinary combination on the DMC already, clinician, statistician, maybe putting somebody with some safety expertise on the DMC, who is sort of firewalled, who could remind the group about what the expectations would be for IND reporting. I also think, we have open sessions where we speak to the DMC beforehand to say here’s what’s happening in the trial, here are some things of note, and we may tell them if there’s been some important IND report. But maybe formalizing a way in which the sponsor safety group could speak to the DMC in an open session and say: this is what we’re seeing. We have concerns that maybe our background rate of X, Y, and Z is a little bit higher than it should be. And so could you, while you’re reviewing the unblinded data, focus on that?
Or: we have just released an IND safety report on this. Maybe keep an eye on other related events. And then also empowering the DMC to be able to get additional information around, for example, causality. So going back and getting a narrative and being able to sort of appreciate whether there is potential causality here. And then also being able to request expertise, for example what I mentioned before, maybe a pulmonologist for something where there’s a pulmonary signal. So I do think, as Dr. Fleming was suggesting, that it may be possible to have a DMC do this, and I think in some sense they already do within the trials that I’ve been involved in, but I think there’s definitely room for improvement and for being able to sort of really target the IND reporting in the way that the FDA would like to.
– Okay, thank you very much. You made some excellent points about the DMC taking on the role of the SAC, and we’ll come back to this as well. Elissa, coming from BioMarin, I wonder if you could talk from a smaller company’s perspective, I don’t know how we define small, I agree with Janet, but smaller perhaps than a large pharma company, about the challenges in general for interim unblinding and what it would mean to set up an SAC. Could you do one in house? Do you have enough size, basically, to put an internal firewall in, or whatever else you’ve been talking about?
– Thank you. So BioMarin is a small-ish company.
(laughing) And we do tend to run small studies in the orphan disease space, and because even the registrational, pivotal studies are open label and small in size, we do try to limit the data to the appropriate personnel. So we have had some success in using data analysis plans to blind the majority of the study team, and the majority of folks within the organization, to sort of the pivotal data. But having a safety assessment committee, I think, does pose a challenge in terms of manpower and actual company size. And how to sort of ensure independence and a firewall of people, if we were able to pull this off, and keeping that sort of protected within one group or one set of individuals within an organization. So I would agree with Dr. Fleming and with Erin in that I think this is a role that the DMC could take on. I know in my experience at BioMarin, in our open sessions of the DMC, as the pharmacovigilance person on the program, I do provide not only a review of the safety and adverse events but also any interim events that may not be in their data cut packet. So I think it is challenging to sort of have both committees sort of running simultaneously within a small company, and I think more so it’s a practical manpower issue, given that, within a small company, the biostats group, for example, may also sit within the same area. So while there are SOPs, and training, and deviation plans that may be put in place, it’s still challenging: you walk by somebody’s screen, or there’s something on the printer. So maintaining that level of independence and firewall, I think, certainly is a challenge in a smaller organization.
– Thank you. That just reminds me of experiences working, I worked briefly in a small company, and there were times I wanted to come to work with a paper bag over my head so that nobody could see my expression.
– Right.
– Having looked at the data.
– Exactly.
– So most of you know this, but with Tom and Dave we have two of the most world-renowned and respected authorities on data monitoring committees and their function, and the most experienced. Both of these gentlemen paved the way for data monitoring committees in the NIH model and then carried that to the pharma trial model, which happened, I think, a little bit later. That’s my memory and, in fact, the first DMC I ever reported to for a sponsor was in the 1990s, and Dave was running the DMC, and I learned a lot. They’ve written extensively about this. So we’ve heard from Tom. We have an opportunity now to hear from Dave and, Dave, I have two questions I want you to answer. (chuckling) So first, from your experience, what is the feasibility of a DMC taking on the IND safety reporting role, in terms of dealing with different thresholds, different types of risk assessment versus benefit-risk assessment, and so forth? And then, second, if a separate SAC was set up and you were on the DMC, what advice would you have about these two groups functioning together, recognizing they may come up with different advice about the same risk or the same signal and so forth?
Just based on your extensive experience. If you don’t mind.
– Yeah. (Lisa chuckling) Well, I attended my first DSMB meeting (audience chuckling) 45 years ago this month. I didn’t have any gray hair in those days, and I guess that’s an association. I don’t know if it’s causal or not, but in any case, (audience laughing) I think I’m old enough. I’m going to answer your questions, but I’m gonna add a few of my own comments.
– Absolutely.
– So I wanna put in a plug for the paper that, in some sense, Janet Wittes summarized. It’s not an easy read, but it’s a very great read, because, probably better than anywhere else, it summarizes what’s going on. And there’s a nice section in there about the state of affairs of the data we collect. None of this matters if we can’t get the data straight, in my opinion, and I don’t think we’re getting the data straight. As Chris Granger alluded to, we have a passive system, and from this passive system we try to get structured data, which is not possible. So Salim was complaining, as I have been for many years: if we want good data, then structure it. We know what the deal-breakers are. We’re talking about events that would stop a trial in its tracks or a product in its tracks. We’re not that stupid. We know what they are, or most of them. I mean, of course, it depends on the context of the disease. So let’s collect that data in a structured way. If you wanna collect all that other stuff, fine, but you know, there are two ways to hide a signal. You make it so granular you can’t see anything, or you drown it with noise, and that’s exactly what we do every day. I see page after page of adverse event reports. I have no idea what it means. It’s full of ones and zeros and twos. You can’t make any sense out of that. Anyway, so that’s problem one. I am amazed when people say, I want you to be on a DMC but you only look at safety. Really?
(panel snickering) It’s always a risk-benefit assessment. As has been alluded to several times, there are many cases where, yeah, you see some problems, but the benefits overwhelm the problems, and you’re willing to put up with a little bad stuff if you’ve got a lotta great stuff. So I’m always amazed, and I refuse to participate in such things, by the way. I don’t wanna get caught in it. I have been puzzled from the beginning by this guidance. I call it the X, Y, Z problem. If X is the data you know about, it’s published, it’s been presented, and now you add Y, and you get X plus Y equals Z, and Z is known, I think my four grandsons could solve that equation and figure out what Y is, and you’ve unblinded the ongoing studies. So I don’t get it. If we’re trying to protect the blind, we’ve set up a perfect system to unblind it. So I’m puzzled by that and frustrated by that. Firewalls. I don’t believe in firewalls. I agree with you, firewalls in a small company would be asking the impossible. It’s just asking too much. But even for a large study or a large company, whatever that means, I still don’t believe in firewalls, because it’s not that anybody’s handing you a piece of paper with the data. It’s your activity, your mood, the amount of time you spend in meetings. In my groups when I was at the NIH, and it was constant, I always knew what was going on, more or less, because I watched the activity. I could tell whether they were smiling or not smiling. (audience chuckling) So I don’t believe in firewalls, big or small. I would support the concept, I’m calling it DMC Plus. There are examples: the AIDS group, which I was part of for a long time. We monitored God knows how many, maybe 50 trials, in that portfolio. It was very structured and it was very demanding, but we did it. Another sponsor I’ve been working with for 10 years, we monitor all their trials for this indication. So it requires more will, more work, more discipline, a little more education, but you can do it. I don’t see how you could have an SAC that didn’t have the operating principles of a good DMC, which I think is what Tom was mentioning. If they’re not behaving that way, I wouldn’t know what the value was. So, you know, you can call ’em what you want, but if you have one monitoring committee for one study, maybe you need an uber DMC Plus to look at the big picture. But many DMCs have monitored and can monitor a portfolio, so I think it’s possible to do that. In fact, it’s been done, and done pretty darn well in my experience, but, again, at my age I’ve learned to never say never. So there’s always the exception, perhaps. Thank you.
– So I think we heard Dave’s grandsons are eligible for DMC services now. (audience laughing) (panel laughing) Alright, this is a serious topic. So, Matt, the FDA always gets the last word. (audience chuckling)
– Get the last word.
– So, yeah, I know from my stint at the FDA. We get really, I can’t
say “we” anymore, it used to be “we”, get really nervous that there’s too much unblinding going on, and Tom gave a great example of exactly that happening, or our concern about it happening. We were so nervous about it we had a Part 15 meeting on interim unblinding in 2014 that Tom presented at. What are your reactions to what you’ve heard, and what are the concerns from a statistician in the safety group at CDER about setting up SACs, having interim unblinding, possibly some amount of that happening inside the sponsor, and so forth?
– Well, thank you. How I’m gonna approach this is I kinda wanted to draw a parallel to how we evaluate cardiovascular risk in the type two diabetes setting where, if you’re not familiar with it, the 2008 guidance ultimately sets up a framework where you have to rule out a 1.8 relative risk of cardiovascular events at stage one, which we usually use for gaining marketing approval, and then later you have to rule out 1.3. And one approach to address those two risk margins is sponsors have done a single cardiovascular outcomes trial where you do an interim look in that ongoing trial to look at the 1.8 risk margin. So this setting very much sets up, I think, something parallel to how an SAC might be thought of, where a sponsor has to be unblinded to those interim results. We saw this early on. We saw it as a challenge. Lisa mentioned we had a Part 15 meeting to kinda look at this, to get other feedback, but also to kinda evaluate how we were doing things and the processes we requested from the sponsors to hopefully have some degree of a firewall of that data for ongoing trials. So really what we do in that setting, or what we’ve requested, is we have sponsors fully document who’s gonna have access to that interim data, what their role is in the company, and what level of access they have to the data. We also ask for information on how the data flows to that internal group within the company, because, ultimately, they have to give us the information from the interim of the trial. So I think what we’ve learned after implementing this and using this approach for several years is that, obviously, on paper it looks good. The firewall, we think it’s there. It’s a small group. We feel everything’s in place. They have confidentiality agreements. The data flow seems to be correct. But I think, in reality, there really is no true safeguard to say that that firewall does exist, and Dave brought up a great point: there could be a mood within the company. There could be something going on. So for me the question, trying to parallel this into the SAC world, is: if you have internal sponsors looking at this interim data, are they ever truly firewalled? Can we believe it? And where it’s hard is, as regulators, we try to figure out, did that firewall hold or not? It’s really hard to do. We’re not there watching the trial ongoing. We’re not sitting in the company. We don’t know, and we probably never will know. And when things do happen, Tom brought up a great point, the LIGHT trial, which our division reviewed. And it was only after a little tip here, a little thing there, things just didn’t look right, that we started really asking more questions, and that’s when we figured out, ooh, that firewall isn’t in place at all. So maybe it was luck. We don’t know if we would catch something like that again. So it’s very difficult to really do. Another point I wanted to bring up is when we’re looking at interim findings from ongoing trials and, again, most of my experience is gonna come from the post-marketing setting. We’re looking at large outcomes trials. They’re multiple years in duration. They’re big trials. And, again, these I think are the ideal trials, as brought up earlier: if you’re gonna find something for safety, these are probably the right ones, because these big-ticket safety items are not at a high frequency rate, so you do kinda need this large trial to look at it. So when we’re in that setting, we have been in a position where we’ve actually had a DMC come to us and say, we have a signal. We have something we’re concerned about. And what we’ve talked about internally at FDA is, well, what are we gonna do with this?
We have one side of the equation. We don’t have the full picture. You’re gonna maybe only report to us a risk. So, in those situations, we have asked, okay, report to us. Don’t tell us what direction that signal is, where it’s going. We just wanna know, okay, you found a signal. What are your plans? And we deferred to the DMC to ultimately protect the trial and the patients under their original purview. So, in general, I think where I would come down on things is, when we’re looking at this category C, I would probably lean towards where Tom’s proposal is, that ultimately for these types of events the DMC is already situated to really look at them. And if, for some reason, a sponsor does need access to the interim or any interim finding, that it’s limited to as few people as possible, and if there’s any way to kinda roll them off as the product’s maturing, I would feel that’s an ideal situation.
– [Lisa] Thank you so much for that insight. So I wanna ask first if Tom or Ajay have anything to add, having heard the reactions, before I go back to a couple more questions.
– So, thank you, Lisa. I guess this has been a great discussion, and I sorta took some notes here, and I’m looking at four major themes. And one is external versus internal. I think we’ve all discussed that, and clearly I think everybody agrees if we have external bodies, firewalls, we’ll be more effective. I don’t think anybody disagrees with that. The issue then becomes, what are the logistical practicalities of that if we have SACs across the board for the developmental portfolio?
I mean, that I think is an issue, so having them all external is one way, or having an IDMC take them on, but then we would have to have some way to look at studies that don’t have an IDMC. So we would be restricted to looking at category C only for the large, pivotal studies. We would not be able to look at it for the small studies, and that’s fine. I think that’s one way to go. But the remit, obviously, of the IDMC versus the SAC would be the second point, which is a bit different, right? Because the IDMC is looking at benefit-risk, whereas for the SAC, presumably, we wanna identify early signals. Early emerging signals that are not showstoppers. They are not gonna stop the study. They may not even impact the protocol conduct, but we think they are potentially causally related. That’s a pretty different bar, and so that, I think, is gonna require a major updating of all the charters, and the DSMC members are gonna have to realize that, look, we’re gonna have to now understand that any emerging signal would require us to notify the sponsor and perhaps prep the FDA. Which brings me now, I think, to a third point that’s been discussed, holistically, for years. Interpretation of interim data. I mean, that is an issue. I mean, if we’re gonna look at interim safety data with small numbers, and we all realize that the contemporaneous control is the best control, there’s gonna be a lot of instability, right? You may see a three versus zero at three months for a really important event, and at six months it may be four versus three the other way. So where do you draw the line on sending that in? And that, I think, is an issue that will need some further clarity. And then, finally, if there is an SAC constituted in a study that has a DMC, I think it may be even somewhat more complicated. So, to take your point, for the type two diabetes studies, we would probably have an IDMC, an adjudication committee for the cardiac endpoints, and we may well have a safety endpoint adjudication committee for pancreatitis. And now we’re gonna have an SAC on top of three other committees. Now how do they interact? And I must confess it would be very difficult for a sponsor to question, I wouldn’t want to question, a body that’s looking at unblinded data that says you’ve got a signal. I have no way to put any judgment on that, ’cause I have blinded data. So now we may have four committees, and would the SAC then talk to adjudication? We’re into a lot of complications there, so–
– I think those are very good points, thanks. And Tom, your reactions too and, if I can, I’m gonna embed another question.
(chuckling) So we point out in our FDA guidance, and Bob and I have had many hours of conversations about what trials need DMCs, and the fact that since the FDA guidance was published in 2006, many, many more trials now have DMCs than did at that time. But there are trials that don’t have DMCs, and so we haven’t talked, I don’t think anyone has, about a trial that doesn’t have a DMC but then would have an SAC. What do you see as the challenges there, in terms of, with the SAC in place, would there be a temptation for the sponsor to expect the SAC to maybe all of a sudden now act like a DMC? I mean, do you think that they could keep their remit focused just on IND safety, or do you think there’d be a concern that it might morph into a DMC? Just if you have time to answer that.
– So I’ll try to circle to that at the end, but I’ll try to also be brief. Listening to the thoughtful comments by the others, a really critical issue here is we do want to be able to identify safety risks in a timely way. Everything is in the context also, though, of benefit to risk, and we obtain so much valuable insight in evidence-based medicine from our larger, randomized clinical trials, and we surely cannot compromise the integrity of those studies, and confidentiality is integral to the integrity of those studies. So there’s no absolute. There are reasons to, in fact, share confidential information beyond the DMC. I call it need to know, but we should be persuaded that the risks that will be induced from certain levels of sharing of that information are kept as small as possible and that it’s unavoidable relative to the gains that we hope to achieve. So my sense is, for category A and B, yes, we do need to have medical monitors. Practically speaking, they’re gonna be within the sponsor. They are gonna have continuous monitoring. They are going to, in fact, in some cases have access to unblinded data, and my sense is the gains that we get from that offset, or are greater than, the level of compromise to the integrity of the studies. I remember in 2008 when we were on the Endocrinologic and Metabolic Drugs Advisory Committee, and we had been in type two diabetes, and we had seen so many studies done that had 300 person-years of involvement and that found non-inferiority in hemoglobin A1c. And we were saying we do not understand adequately what the effects are on microvascular and macrovascular complications; that will come from large-scale randomized trials. But we were practical, saying we wanted a paradigm shift from 300 person-year studies to 40,000 person-year studies, which is what it would take to rule out 1.3. And so we took a compromise, and FDA was very responsive in implementing this compromise of the 1.3 and 1.8, where you get onto the market at 1.8 and you finish it at 1.3. So, in essence, it was a compromise to move us from 300 to 40,000 in a feasible, practical way that still got these drugs onto the market at 8,000. But it’s imperative that we get those gains with the minimal loss of integrity to those studies. And FDA, and industry, and academia have worked very hard to think of how to do this, and it does depend, to an extent, on firewalls, and I gave an example of a failed case, but I will say there have been many successful cases where sponsors have carried this out. Is there a better way? I would love to not have to firewall it, and I would love to not have to release the data when I’m on a data monitoring committee and I know that the study truly has to rule out 1.3 to be reliable. And yet if we’re going to be able to have a feasible approach, and we’re gonna get timely access to interventions, this is what we have to do. Well, now we’re talking about category C events, and for category C events, to me it’s imperative, as has been pointed out, that you understand these in the context of randomized comparisons that are unblinded and that are based on understanding the totality of the data. And that leads, to me, to the conclusion that it’s a DMC-like, not simply a DMC, but a DMC-like process. The DMC may be ideally suited to do it, but if it’s done outside the DMC, I have great concerns if it’s done under the umbrella of the sponsor. It would ideally be separate, and it might be coexisting committees, where the motivation for this could be that the SAC is looking at many studies, although a DMC could also be looking at many studies. The fundamental issue here is I’m not persuaded that the gains that we get by running an SAC within the sponsor, particularly in a small organization, broadly unblinded, are offsetting the risks to the integrity of that research that’s integral to evidence-based medicine, and this is the trade-off. So do we need DMCs in all studies?
No, we don’t, and there is clearly an understanding of settings where you do: where it’s irreversible morbidity or mortality, where there are major risks for the intervention. There are, however, a lot of other settings where a DMC can be very, very helpful in enhancing the integrity of the research. When we’re on monitoring committees, my first and foremost responsibility is safeguarding patient interests, but it’s also to help enhance the integrity and credibility of the trial, which also is improving quality of study conduct, and it’s also based in settings where sponsors now, and it’s happening more often, have said, we want you to monitor all of the development. In fact, it was something I think Bob Temple said to me a long time ago, which is, shouldn’t the DMC potentially monitor a program of related studies? And, as Dave pointed out, we effectively monitored 50 trials. We monitored on the HIV/AIDS data monitoring committee all of the ACTG and CPCRA trials for 20 years, which was an extremely efficient way for an independent entity to be able to understand the totality of the data. So my sense is there’s no single right answer for where you do have one, but when you don’t have one then I certainly do see that an SAC-type arrangement could be very useful, but it should be as independent as possible and, I agree with Dave, if you’re going to monitor safety issues you need to understand them in the context of benefit to risk. And so you should be seeing the totality of the data which, by the way, seems to me that I’m circling back to saying data monitoring committees (panel laughing) (audience laughing) are really key.
– Okay, so we have 10 minutes. First of all, session two needs to get some credit for getting us back on time. (audience laughing) We have 10 minutes for audience speakers before lunch, and if you would go to the mic, while you’re going to the mic, if we have any, sorry, audience questions. If you have any questions for this terrific panel that’s up here. We did get a question on the phone from Walt Offen at AbbVie about why can’t a DMC serve the purpose of an SAC in terms of determining when to report SAEs to the FDA, and I think we’ve talked about that quite a bit already. But then Walt went one step further and said, could the regulators develop rules, like setting p-values for these imbalances, and so forth? And so I’ll just let Matt answer that, and I’m guessing it’s a one-word answer. (Lisa laughing)
– (laughing) No. (audience laughing)
– But, in all seriousness, we haven’t talked much about thresholds. Tom mentioned them in his presentation, but that hasn’t been one of our topics. Maybe it’ll get discussed more this afternoon, but an imbalance to one group may not be an imbalance to another group, and I think that gets into the DMC and SAC as well. Yes, thank you.
– Natalie from Novartis. I have a question. Since this morning we are talking about aggregated data, and I understand that the DMC, really, when I am looking at one single trial and I am looking at the DMC for this particular study, I am looking at the clinical database, and I’m really looking at the totality of the data, as you just mentioned. When you are talking here about aggregated reports for this particular activity, are we talking about pooling databases, which are clinical databases, or are we talking about looking at our Argus system and looking only at the, as you said, adverse events, which can be quite different if you want to understand the totality of the risk? So the question is really clinical database pooling versus really only looking at Argus, which is really only the SAE system.
– If I’m understanding the question, I know this afternoon there’s gonna be a discussion about aggregated data and meta-analysis. The setting that I’ve been talking about: I’m on several data monitoring committees that are monitoring the entire clinical development program for a single investigational drug. We monitor those studies as independent studies. Each study is complementary in terms of asking a certain question about the use of the intervention in a certain manner. But we do have the benefit of seeing, collectively, what the adverse experiences are, so a correct point that’s been made is, if a monitoring committee’s monitoring a single trial and yet there are many ongoing studies, that monitoring committee is very well informed about benefit to risk in that trial but doesn’t see other studies. Well, the sponsors that have set up single monitoring committees to monitor a program, I think, get the best of both worlds, because it’s an efficient process for each study to be monitored for its independent objectives, but where the DMC does have the insight, collectively, across the program, of the emerging safety experience.
– [Natalie] But when you’re looking at, in this particular experience, you are looking, for example, at the vital signs, at the laboratory data. You are looking not only at SAEs, right?
– And it’s–
– It’s a broader picture that you are looking at.
– Amit, you wanna? You wanna–
– Yes, I just wanted to add a word to Dr. Fleming’s, that we believe that if we have to have SACs, starting off incorporated within the DMC Plus type of situations, it would be good to have a program-level DMC because of the reasons we already talked about. So it will go back to the clinical database. That’s not just for that trial but for other trials of the same program as well.
– [Lisa] Yes, next question.
– [Audience Member] I agree that for reporting we need to do an unblinded analysis. Somebody needs to do an unblinded analysis but, and I don’t know if this is gonna be talked about in the afternoon session, I think if we take a two-stage approach, we could leverage the scientific expertise and the medical judgment of the multi-disciplinary safety management team. That’s where the strength is, in the study team, the blinded team. So there are methods, quantitative methods, that we could use to look at the blinded data if we have information, again, looking for the wolves, not the chipmunks. And then if there’s some discomfort with what is being observed, then you could reach out to an unblinded group and, in that case, if it’s not a broad and frequent unblinding like for an SAC, which is recommended in the 2015 guidance, then we could take advantage of this two-stage approach. I don’t know what kinda reactions the panel would have about that.
– So–
– Ajay.
– I guess in some ways we do that now. So I could give you an example. Now, this may not be ideal, but there are several instances where the SRT, which is blinded, will look at a cluster of events and some
of them may be unblinded because there were SUSARs Others are not unblinded Hypothetically that’s every serious event and we have a total of 20 events and five are unblinded All five are on active drug Well that makes us somewhat worry What we would probably do is send that information to the IDMC

and say look we’ve got this cluster of events We’d like to work with you What more information on AEs and to the point that, you know, it is a data dump but can we facilitate getting you the appropriate AEs and lab values for you to be able to make some sort of adjudication on this cluster of events? We would work with the IDMC to come up with a data set that they could use to see if there’s something concerning Now I think the challenging part comes in and maybe protocolly you could speak to this let’s say the IDMC then says yeah there’s imbalance but it’s nowhere near to tell us that the study should be stopped or changed How do you communicate that? What we would likely do is having a steering committee which is maybe the CMO or one other person that’s sort of sitting between the IDMC and the study team to get that information I don’t know if you’ve had similar types of – I agree with you and my sense is the monitoring committee may readily in a case like that say from the perspective of equipoise in the trial we should continue And if I were simply monitoring the trial to safeguard patient interests and to enhance trial integrity I wouldn’t do any reporting If, in fact, though I needed to carry out IND safety reporting then we’ll talk this afternoon, I think, about what nature of thresholds would be in place And if, in fact, this event required then timely reporting to FDA as an IND safety report then that should happen and we should talk about what’s the most effective way for the data monitoring committee, upon seeing this, to report this to an SAC or to the FDA – [Lisa] I don’t know Alyssa if you wanted to respond at all about– – Me? 
– Actually I was looking at Alyssa About whether that’s practical the two stage approach he was just talking about or maybe that already happens in BioMarin – Yeah I mean I think that that’s pretty common practice I mean our trials for the most part, as I mentioned earlier, tend to be open label but, even so, if there’s even any even small level of concern it would be then to bring it to the DMC and if the trial was partially blinded, partially unblinded there may be data that is protected that the SMT or SRT can’t see and then you could bring that unblinded data to the DMC for review in a closed session – [Audience Member] Right, so– – [Lisa] I’m sorry There’s a lineup behind you. (laughing) I promised I’d finish on time – [Bob] Hi, I just wanna separate the question that’s fundamental to our guidance and the larger question of whether you should stop the study or something That is not what the guidance is about It’s only to determine whether the event that’s being described is really increased in the drug group That’s all it’s about It’s not whether you should stop the study or anything like that It isn’t even about whether you should change what’s in the informed consent all of which is relevant So my question is what are we worried about potential unblinding? So I’ve got, oh I don’t know, GI bleeds and they’re serious and I have a bunch of ’em and I wanna know whether there were more in the group that got the drug than the group that didn’t So I unblind only the drug that the people who had the bleeds were on I don’t unblind anybody else in the study just the seven people who had bleeds and I find out what drug they were on Why does that make me worry about the integrity of the study? It’s a tiny fraction of the total patients We don’t know anything about the effectiveness of the drug in those people at least if they don’t look at that All they look at is what drug the people who bled were on – Right, so Bob– – What are we worried about? 
– There’s a continuum and in the example you’re giving actually feels to me a little bit more like category B than category C So I’m with you– – No, no, no, no, no, wait – You’re unblinding a very small number– – [Bob] Sorry GI bleeds are common in this population The presence of a GI bleed doesn’t tell you anything You gotta know if it’s unbalanced – And if they are common in the population now then I’m we’ll talk this afternoon about thresholds but I’m not sure why seven, unless it’s seven zero, is going to be sufficient for you to declare that you have causality – [Bob] I’m not saying it is All I’m saying is that to define whether it’s causal or whether you’re worried about it all you need to find out is what drug the people who bled were on Why does that make you worried about the integrity of the study? – Because, in fact, in principle to do this right in an informed way in many instances it’s more than seven people It would be a much larger number and, in fact,– – Why? – If you really wanted to understand this

it probably is best to be understood in a broader context I think the example was used about if you’re seeing strokes what do the TIAs show? If you’re seeing MIs what do the angina? You’re gonna look at more things and the data monitoring committee could do that So could an SAC but as the SAC sees more data now you’ve made a transition from a small number where if you unblind it doesn’t compromise integrity to where it’s now not a small number – [Bob] Lemme keep going there But you don’t know anything about whether the drug worked in that population You’re not even asking that and I would argue you shouldn’t even look What are you worried about? You now know the people who bled or had whatever it was were seven of them were on drug, seven were on placebo What am I worried about? – If I could– – Yes – Apologies I think if I read you right Dr. Temple that sort of is like the two step process that we were talking about before where a team might see a cluster of events, have somebody unblind them I think it’s my understanding of the proposal right now is for routine unblinding of all SAEs across the whole study There I do think the amount of unblinding is large as compared to targeted unblinding where you see a cluster and I think that’s where I see a bigger difference That we’re now proposing routine unblinding of all SAEs by this SAE committee – [Bob] Suppose that all they knew about the patients were what drug they were on and they had no idea whether they had study endpoints or not All they knew was the side effects – Great but that– – How worrisome is that? 
– It just gets to the point that if the committee is internal there’s gonna be a lot of people that are gonna be looking at unblinded data and concerns about maintaining firewall That’s all – So you’re worried that looking at that will reveal, in fact, to some people whether they did well or did badly– – It might reveal if you start looking at the number of events there are gonna be a number of people that are now unblinded – So – And so – And Dave you wanted to say something Well I’ll have this discussion this afternoon but one thing you learn very quickly in a monitoring committee seven zero I would make a bet It would go away And usually I win most of the bets (panel laughing) because you realize that data over time, wiggle and a waggle So I don’t know what that seven zero is telling you but I don’t believe it’s gonna tell you anything – [Bob] That’s a totally different question – No it’s not – Yes it is – Why do you need to know seven? – [Bob] Look a fair question You’re asking whether type C events ought to be even analyzed at all It’s a different question I’m not dismissing it I’m just saying it a different question – But I’m saying you wanna know something I’m saying that seven zero isn’t something that probably I would even pay too much attention to unless it became 20 to zero – So Bob you correctly noted– – I’m not dismissing the question I’m just saying what we– – In 2015 you noted that it was important not to deluge FDA with everything that’s in SAE – Right – There needs to be evidence of causality, relationship – That’s right – What we’re saying is when you monitor a lotta trials and you see seven events that are common it’s not just that we would go on because benefit exceeds risk It’s you realize that that type of data is completely unreliable as to whether or not treatment is inducing that So, by your own decree, you don’t wanna see those data Not until the numbers are larger and now we’re talking larger numbers Now we’re talking broader unblinding – 
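The exchange above about a seven-to-zero split versus a 20-to-zero split can be put into rough numbers. The sketch below is illustrative arithmetic only, not something presented at the workshop. It assumes 1:1 randomization with equal exposure in both arms, events that occur independently, and no true drug effect; the function names and the figure of 100 monitored event types are hypothetical.

```python
from math import comb


def prob_split_at_least(k: int, n: int, p_drug: float = 0.5) -> float:
    """P(at least k of n events fall in the drug arm) when each event
    independently lands in the drug arm with probability p_drug."""
    return sum(
        comb(n, j) * p_drug**j * (1 - p_drug) ** (n - j)
        for j in range(k, n + 1)
    )


def prob_any_lopsided(k: int, n: int, n_event_types: int,
                      p_drug: float = 0.5) -> float:
    """Chance that at least one of n_event_types independent event types
    shows k or more of its n events on drug, purely by chance."""
    p_single = prob_split_at_least(k, n, p_drug)
    return 1 - (1 - p_single) ** n_event_types


# One pre-specified event type: all 7 (or all 20) events on drug by chance.
seven_zero = prob_split_at_least(7, 7)
twenty_zero = prob_split_at_least(20, 20)

# Hypothetical: 100 different event types are being watched at once.
any_seven_zero = prob_any_lopsided(7, 7, 100)

print(f"7-0 split, single event type: {seven_zero:.4f}")
print(f"20-0 split, single event type: {twenty_zero:.2e}")
print(f"At least one 7-0 split among 100 event types: {any_seven_zero:.2f}")
```

Under these assumptions a seven-to-zero split is unlikely for any one pre-specified event, but it becomes unsurprising when many event types are monitored at once, which is one way to read the panel's caution about small imbalances.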
[Bob] All I’m saying is you’re raising a different issue You’re saying these type C events are almost never detectable in a meaningful way and we should forget about it That’s a different question – Never’s a strong term – [Bob] I’m not dismissing it It’s a different question though – So we have to– – The current premise was that you are supposed to look at these events and see whether they’re unbalanced and if they are report it Maybe that’s a mistake – Define imbalanced – Well– – Because by your own– – That’s the question – Declaration you do not wanna see noise You wanna see signal – [Lisa] Okay so today is the International Day of Women (audience laughing) (audience applauding) (audience laughing) – [Janet] So that these women can prevent lunch? Is that the idea? (audience laughing) – I wanted to make sure Erin didn’t have any last words The woman on the phone Erin? Did you have anything else you wanted to add? (Erin laughing) (Lisa laughing) – [Erin] I just have two comments I think one of the challenges with the discussion that was just happening is, again, how do you know to look at GI bleeds? If you’re not looking at unblinded data which I think is the challenge that we were discussing before about knowing what background rates are which I think is very hard to do And so that’s one challenge and I think the other thing that I just wanted to maybe pose to the group, I don’t know that we’ll have time to discuss it,

but is if you have a separate SAC from a DMC and the idea that we would sort of want somebody who’s assessing safety to have the totality of information and to be able to see the benefit risk ratio in general I guess I’m a little bit uncomfortable with the idea that they would be then also seeing efficacy We normally limit the efficacy looks and so that’s just a bit of a challenge I think if you have another group who’s also looking at the totality of the data – Right so we have lots of challenges but we’ll clear ’em all out this afternoon (audience laughing) Janet I’d love to call on you but they’re really pulling the rug out from under me up here Thank you so much to our panelists This was a great discussion (audience applauding) – Okay thanks to Lisa and the panelists We’re gonna go ahead and take our lunch break It’s not a full hour now It’s gonna be whatever it takes to get us to 1:30 which is when we’re gonna start the next session If you’re not able to finish your lunch and, again, there’s a list of restaurants at our registration desk There are a lotta restaurants in the area You can bring your food back into this room You could also leave your stuff in this room We’re gonna be here so it’ll be fine (crowd murmuring)

(papers rustling)

Is everybody mic’d?

Are we waiting for someone? (man murmuring)

– Where? (man murmuring)

– Right now? – Yeah

– Okay, I can invite you guys to join me up on stage

(audience laughing)

Okay welcome back everybody

We’re gonna go ahead and get started

In this next session we’re going to take

a bit of a deeper dive into the approaches

for determining the threshold for reporting adverse events and consider key factors that might influence over-reporting to the FDA and how potentially to, the best way we can, mitigate these challenges by setting an appropriate threshold Joining me up here on stage are Ann Strauss Associate Vice President Clinical Safety and Risk Management at Merck As well as Anand Chokkalingam, Director of Epidemiology at Gilead Sciences – Salim Yusuf – Won’t you sit down on the stage? – Professor, Department of Medicine at McMaster University and then Gerald Dal Pan, Director Office of Surveillance and Epidemiology at FDA This session will start with a brief presentation from Ann and then followed by a panel comments and then we’ll open it up to the audience for discussion and if this session’s anything like the others I’ll try to start the audience discussion a little bit earlier because usually in public events we have like one or two people will go to the mic and we have like five or 10 minutes of audience discussion But for this group you guys are and we have exactly the same number of people from this morning We didn’t get people trailing off and we usually have a lot of people lined up at the mic so I’ll make sure to make sure we leave some time for you all to ask your important questions Ann? – Okay so (man murmuring) Oh thank you I’m never very good with this technology Oh, turn it off – [Greg] Just keep going – Oh at the okay sorry Lunch break. (laughing) (audience laughing) (papers rustling) So good afternoon It’s a wonderful opportunity for us to present at this meeting today I, as Adam said, I work at Merck in clinical Safety and Risk Management and we’ve actually had a project team that has been working on the guidance for about two years and that’s called the Long Term Quantitative Enablement Team and I am presenting on their behalf So, as you all know, this topic is when do you report an aggregate IND safety report? 
So and then, once again, I know you all know this but I just wanted to re-emphasize it is that the way we look at it is we would send down an aggregate report down to FDA if we thought there was a causal relationship between a drug and a serious and unexpected adverse event Or if we see a clinically important increase So that sounds relatively simple but reading through the guidance again I think one of the points that is so important is that to be able to figure out when to send something down to FDA as an aggregate IND report is a complex judgment call And I think one of the reasons why we all have jobs (chuckling) and it’s generally not a simple application of a planned statistical analysis So we looked at that and we thought okay so how can we go about it? How can we have some sort of framework whereby we can meet the guidance? So I think what’s important from a practical point of view and what we at Merck do is that for all our products we have a safety review team and that safety review team consists of multi-disciplinary experts And we review aggregate data on a routine and ad hoc basis and I think especially for aggregate reporting

we at Merck and at the CSRM had a much more intense relationship with our statistical colleagues So one thing is to have those experts and then the second point to consider is that we feel it’s important that to have an aggregate program level plan So to have what we call an AgSAP So that when you have a program going into phase three you at the end of phase two in general you plan and decide okay how are we going to monitor this? What are the important events that we are concerned about? What are the important anticipated events that we’ll be looking at and what are the background rates? So that you have a strategic plan before you start because I think in the past I’ve been at companies where you just really looked at the data from more of an ad hoc point of view So this is definitely planning before you go into phase three So if you look at the key components of our AgSAP the first part is obviously looking at program level safety topics and populations of interest Then you also have what’s known as an ongoing analysis of safety evaluation for blinded data and then you also have another document where you look at unblinded data And, obviously, the fourth part of the AgSAP is we prepare for our filing And having this all together helps with the reporting up to FDA So just to go through some of the blinded ongoing analysis of safety you’re basically trying to validate the safety profile of the product and you’re trying to detect the emerging safety topics And these safety topics and signals are validated by the safety review team And, as I said at the beginning, the very important thing for the blinded ongoing analysis of safety is being able to know what anticipated events you want to look at, try and find background rates because I think background rates are not always that easy to quantify and it’s not always that easy to even if you have phase two data it’s often just one data point in time So and then also quantitative frameworks So when our statisticians and 
our safety physicians got together they said okay what are the key questions to answer if we’re looking at blinded data do we have more events occurring than we expected? Yes or no and what is the magnitude of this relative risk? So we came up with a quantitative framework to stimulate safety review team discussions and improve conversations about safety monitoring But these are not decision rules So just to show you this is work that was done by us, my statistical colleagues, specifically Greg Ball and I’m going to read this ’cause I’m definitely not a statistician but basically it’s coming up with probability threshold boundaries for incidence rates So if you have an event that you’re concerned about you can see as your clinical trial goes along you can see whether your events are actually going to be crossing the threshold that you’re going to be concerned about I mean if you look at obviously the red line is the bit that if you cross the red line you’re gonna be concerned about So, for instance, as an example this is a simulation but if we had an event that we were concerned about and the safety review team was looking at this data we could say okay it looks as though what we’re seeing is not terribly alarming If we had to see it increasing and crossing the red threshold that is when we would, the safety review team, would look at the blinded data, not unblinded, look at the blinded data and see whether something additional needed to be done So, obviously, the first thing is to see

whether there is an increased risk and then the second part is to see what the magnitude of the relative risk is So this figure can be used to validate the magnitude of risk elevation and the assumptions of background rates So the x axis is the pooled event rate at a given moment and the y axis shows the probability of risk elevation above a certain threshold So it’s basically putting the data that we have into a framework where we can see okay this is higher than we expected and this is the magnitude And so yes this is something that’s concerning but that doesn’t mean we would automatically then send a report We would then have the clinical interpretation of the data So that’s with regard to looking at blinded data We also look at unblinded data on an ongoing basis but the unblinded data ’cause, as you know, a lot of our studies are actually unblinded but we do It’s not that they’re unblinded They’re open label studies So we look at those data when the trials are locked and we, once again, try to understand the safety profile of the product And the way we would actually look at quantifying that is by actually one of the ways that we do is based on Janet Wittes’s paper looking at those rules So we’ve developed a Bayesian version of this type of rule that Dr. 
Wittes published So, basically, the way that we approach this at Merck is that we have our RMST which is the same as the safety review team They’re reviewing the blinded data on an ongoing basis Once the studies are completed they’d look at unblinded pool data and other data sources If they saw something just from their review, they may see something just from review from a clinical point of view, or they could see something from a quantitative frameworks that trigger or that cross a threshold The RMST will then review all the other data that’s available with regard to that specific event and if we have a DMC we would then pose the question to the DMC about whether the DMC thinks this is causally related to the product or not And if the DMC did feel that they would feed it back to us and, in general, if it was actionable, so if we’re gonna update the informed consent, or update the reference safety information, or CCDS we would send it down to FDA as a aggregate report We don’t have DMCs for all our studies If we don’t have a DMC our Chief Safety Officer would generally look at that data at that point So, yes I’m sorry That’s a lot of data that I sort of like rushed but that I hope it didn’t go too fast but the way that we’re trying to do it is having the expertise when something meets a threshold not just having everything go down to FDA but having clinical input with regard to that and really seeing whether we think this is due to drug, whether it’s a serious event, and whether patients need to be informed and we then inform at the same time we inform FDA – [Greg] Okay great – [Ann] Oh sorry Conclusion slide (panel laughing) So appropriate expertise, proactive strategic planning, ongoing safety review, and some sort of quantitative framework – Okay thanks Barbara so I’m just gonna ask the– – Ann – Oh I’m sorry Ann (audience laughing) What am I looking at? 
(panel laughing) It was printed It says thanks Barbara okay – I can be Barbara (audience laughing) I don’t mind – Ann, okay Thanks Ann (audience laughing) So I’m gonna go ahead and turn it over to our panelists to sort of kick off reactions from the presentation and you’re welcome to stay seated if you’d like
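The idea in the presentation of a boundary that a blinded, pooled event count can be tracked against as a trial accrues patients can be sketched in a few lines. This is a simplified frequentist stand-in under stated assumptions, not the framework described by the presenter, which was said to be Bayesian and was not specified in detail. It assumes each patient independently has the event with a known background rate, and the names `upper_tail`, `boundary`, the 2 percent rate and the 5 percent tail probability are all hypothetical.

```python
from math import comb


def upper_tail(x: int, n: int, p: float) -> float:
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(x, n + 1))


def boundary(n: int, background_rate: float, alpha: float = 0.05):
    """Smallest pooled event count whose upper-tail probability under the
    assumed background rate is at most alpha; None if no count qualifies."""
    for x in range(n + 1):
        if upper_tail(x, n, background_rate) <= alpha:
            return x
    return None


# Hypothetical monitoring schedule: boundary at several enrollment sizes,
# assuming a 2% background rate for the event of interest.
background = 0.02
schedule = [100, 200, 400]
boundaries = {n: boundary(n, background) for n in schedule}

for n, b in boundaries.items():
    print(f"n={n}: review if blinded pooled events reach {b}")
```

Consistent with the presenter's point that these are not decision rules, a count reaching the boundary in this sketch would only prompt a closer look by the review team. The result is also only as good as the assumed background rate, which the panel repeatedly notes is hard to pin down.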

or okay – Well thanks very much and it’s a pleasure to be here I thought I’d just add a few pieces which we’ve been thinking about at Gilead Just general sort of considerations I think it’s been alluded to earlier this morning there’s definitely a need to have a good comparator And so one thing that we’re considering when we think about thresholds is when do you want to use the statistical test? I think we’ve discussed possibly avoiding statistical tests when the data come from really different sources You really need a really good internal comparator where you can adjust and make appropriate comparisons The point I’d like to make so I have a number of points I’d just like to go through and ask questions even of this panel and of the audience If you have a good, unexposed comparator you have many options to make comparisons but you need to give consideration to things like multiple testing especially if we’re looking in the area of unanticipated SAEs Things like multiple corrections like the doubly robust FDR or other corrections might be important and then, of course, setting an appropriate p-value threshold and I’m gonna leave that one on the table for a moment But I think when we think about the different events that we have there are a number of different types of thresholds we can use Do we take this descriptively? Do we look at things simply from a standpoint of looking at a rate difference between two groups? Between a comparator arm and a treated arm? 
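The multiple-testing concern raised here can be illustrated with the standard Benjamini-Hochberg false discovery rate procedure. This is a generic textbook method offered as a sketch; it is not necessarily the specific correction the speaker names, and the p-values below are invented for illustration rather than taken from any trial.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if that comparison is flagged
    under the Benjamini-Hochberg procedure at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value is at or below rank * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Flag every comparison ranked at or below that cutoff.
    flagged = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            flagged[idx] = True
    return flagged


# Hypothetical p-values from comparing six adverse event types between arms.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]

unadjusted = [p <= 0.05 for p in p_values]
adjusted = benjamini_hochberg(p_values, q=0.05)

print("Flagged without correction:", sum(unadjusted))
print("Flagged with Benjamini-Hochberg:", sum(adjusted))
```

With these made-up values, four event types pass an unadjusted 0.05 cutoff but only two remain after the correction, which is the kind of reduction in spurious signals the speaker is pointing to.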
So we might think of, for example, for serious adverse reactions that are known we might be looking out for frequency category shifts Something big picture We might be looking at proportional changes or doublings when the risk is at a certain level already We might be looking at absolute rates and absolute proportions And then, of course, if there’s a comparator, a reasonable comparator, we would probably bring in some sort of p-value based on maybe something with an FDR calculation built into it Those are just ideas These are not any specific position that we’ve taken but just ideas for thought around thresholds But it’s important to remember that there are a number of trials that we have at Gilead, I’m sure others have as well, where there isn’t an unexposed comparator And so it’s not really clear what to do here I mean coming up with a background rate is problematic as we’ve discussed but we might often have a study drug versus study drug plus best alternative therapy So you see we don’t really have an unexposed group We’ve been talking today a lot about when to unblind, how that impacts trials, but one other thing that I’m particularly interested in talking about as an epidemiologist as it applies to background rates is the issue of observational studies because the guidance also makes reference to the need to summarize and synthesize data from ongoing observational studies when we have an ongoing clinical trial program So that’s something that is certainly an important consideration and that has some concern for us when it comes down to coming up with an appropriate comparison group there as well because, in many instances, we have a post authorization or post-marketing studies going on in many different countries and these are usually often just cohorts of exposed patients with no built in comparator And finally the last thing I wanna talk about and, again, pose in the context of background rates because that does seem to be a (chuckling) hot issue is 
where claims and EMR data fit in And I think that it’s been discussed today how challenging those can be Deriving a solid comparison group based on claims and EMR data is very challenging Inclusion exclusion criteria, their exact event definitions ICD versus MedDRA codes and frankly some events just don’t work in claims So I think for all the reasons that have been brought up earlier having a comparison built on claims and/or EMR data can be very, very problematic Seriousness is a major one We can’t actually pull out serious as a definition from a claims database So what does that mean if we needed to make a comparison there? Would we have to take all events from our clinical trials that meet that definition regardless of seriousness? So that’s something and the last piece would be adjustment Understanding that the differences between a population we might derive from a large claims or EMR database that might reflect something closer to a general population may not really reflect that of our clinical trials

So bringing in at least age and gender adjustment but other things as well So folks, thoughts for consideration and, as I said, I think those claims and EMR considerations are important to think about not only in terms of the clinical trials but also in terms of how we would provide context for any ongoing observational studies – [Greg] Great thanks, Salim? – I was brought up with the principle that when you can do things in a simple way why complicate it? Now, categories A and B are simple Don’t complicate it Whatever you think just report it That’s rare the way Janet put it Well when we come to category C it seems that the majority of people from industry are asking that question a different way When you do things in a complicated manner why even try to simplify it? That’s the way all of you in industry are approaching it, sorry That’s the way I feel about it You can never interpret, almost never interpret, category C information without context, a comparator Without context which is the totality of the evidence The effects on other adverse events Maybe it’s going exactly the opposite way, and benefit So I think the new version of this guidance should say category C does not have to be reported to us unless a comparator analysis by a group that is looking at the totality of the evidence feels it meets a threshold But threshold, what does that mean? 
Threshold is not just a matter of absolute or relative value It depends on which event I mean if I find a five fold increase in headaches well big deal Even more critically severe headache big deal I’m not sure the FDA wants to know about it I’m not sure it’s going to change the way a program is going to develop but if you take something more serious then it matters So it has to be on serious events It can’t be on other events Once you do that the safety reporting structure becomes simple and I would say on clinically important events an excess that is clinically important Now how do you define excess that is clinically important? I’d say something of sufficient magnitude and seriousness that would change the course of direction of your development program, could seriously modify a trial, or in the case of an individual patient would be seen as life threatening So you need something that is fairly defined, restrictive but wherever possible use your group data comparator, not non comparator data, and report it And it has to be, and the DSMB should report it, when the net effect on health is adverse So there’s a judgment So, for instance, in trials of thrombolytic therapy if they panicked and reported every intracranial bleed we would never know it saved lives We’d never know it saves lives And intracranial bleeds they’re pretty serious stuff It is but its rate is one twentieth that of mortality in mort and the net effect on mortality was highly favorable and became one of the most commonly used treatments for a long time and still is in the majority of the world So I think we need to step back, leave A and B as it is, C probably needs a substantial rethink Now the title of this symposium which is staring at me is Safety Assessment for Investigational New Drug Safety Reporting So I don’t want to bring into this post-marketing surveillance That’s a different issue I know it’s useful in certain circumstances but is incredibly more complicated So let’s not even bring 
that into this Let’s just talk about and usually this mean there are randomized trials going on and that’s what we’re reporting but the word is reporting Most of us in this room have focused the discussion reporting from a sponsor to the regulator but there’s another level of reporting From the investigator to a data management center or a coordinator and then the sponsor reporting back to the investigator and then to its ethics committee That is a much bigger nightmare If you think you’re drowning in paper with that context

you, Bob, you have much better context than an ethics committee has or an IRB has They won’t have a clue what’s happening The second thing is this thing is drowning investigators It is drowning ethics committees and IRBs and I think we should put a moratorium on it Category C should never be reported to investigators or ethics committees unless this independent data monitoring committee says look it’s crossed a threshold The balance of harm exceeds the benefit So when you look at waste in the entire system actually 90 percent of the waste is the traffic between the center and the coordinating center, coordinating center back to the investigator, investigator back to the ethics committee, and then your auditor’s coming in and saying did you have enough oversight to make sure every one of them was reported to the ethics committee? If you could do that that’s a dramatic shift and it would truly reduce the waste and I know this last thing I’m talking about is covered by but wasn’t on anybody’s agenda Nobody talked about it but that is the bigger elephant in the room So please let’s try to think about it The last thing is how would we I strongly believe you don’t need safety assessment committees for ongoing trials instead you need the DSMB or DSMB Plus which is a program being monitored by a DSMB Of course that’s going to raise new questions which are good questions to deal with Now if you’ve got five studies going on are you going to use the same efficacy and monitoring boundary as you would do with one study? 
You lose the chances of replication if you take the collective data and then you say okay I’ve now crossed three or four standard deviations or five standard deviations You lose the chance of looking at subgroups because the only way, today, we can look at subgroups is having three or four positive trials and you pool the data from those positive trials You’ve minimized your chances of looking at efficacy versus safety and the balance of that in a more precise manner And having different designs because no two trials in the same program are identical you’re actually doing more than replication You’re doing some degree of generalizability So I like the idea of DSMB Plus but let us work to come up with solutions for the new challenges that we will face but they’re good challenges to deal with They’re intellectually satisfying to deal with So I’ll stop here and I’d say category C stop getting it on an individual basis Collect it on your regular form centrally, send it to your DSMB, when it crosses a threshold where harm overruns benefit then send it to the regulators and the definition of that is it’s crossed a threshold where you’re going to modify your program That’s the threshold you would use – [Greg] Great, Gerald? 
– Okay well thanks So I’m Gerald Dal Pan I’m the head of the Office of Surveillance and Epidemiology at CDER at FDA and we don’t work in the pre-market IND space under these types of conditions We’re not looking at aggregate IND data So I was a little surprised when I was invited here and I had a little panic call to our internal organizer (Greg laughing) I said well I don’t do this stuff I don’t know what the threshold is But let me give you some perspectives I have from the post-market We deal with a wide variety of data We deal with a lot of adverse event reports in our FAERS database and when we’re able to use them and we use them a lot for safety label changes is when they’re like type A things And essentially that means the background rate is zero or so close to zero When we have the type C events the bleeding with anticoagulants, et cetera, et cetera there’s really not much we can do with individual reports and we go to things like Sentinel We go back to clinical trial data et cetera In the type B data where these things aren’t generally drug associated but do have some frequency in the population what’s challenging, and I think what brings up a challenge for this discussion, is how hard it is to really understand what the population frequency is of a certain adverse event in patients with a given disease So we’ve had an advisory committee several years ago about, I believe, some anti-diabetic products

and the incidence of certain cancers in those products and one of our epidemiologists did a great job of looking into this but there’s not some great library out there of pancreatitis, and breast cancer, and colon cancer, and inflammatory arthritis that occurs in patients with disease A, B, and C So it becomes very difficult to generate these kinds of backgrounds from the available data we have They appear in claims data when somebody complains about them and when somebody bills about them And so that doesn’t mean it’s impossible and you shouldn’t try but there’s no single library out there to get the right comparator data Another issue is the time course of the adverse events I saw that Ann’s draft was in person years That’s often a very good way to do things but some adverse events are duration dependent so you may see your first cluster clustering in patients who’ve been on the drug a long time If your person time is dominated by people who’ve just entered the trial you may be misled a bit there as well I think good planning is important The idea of having the blinded committee and the unblinded committee having plans is important and I assume that these plans depend on the drug, its known risks, its class, and what you’re treating it for because in the end, as I think some of our other presenters have said, this really is about balancing the benefits and risks of the drug not simply determining absolute risks What the comparator is is important as I think as a lot of disease areas become very mature and the therapies that are available for them I think it’s gonna be more and more difficult to get placebo control trials and a lot of therapies will be add on therapies So you’re gonna be comparing your drug to your drug plus some other drug and teasing out what that other drug is especially if it has maybe a similar effect or similar adverse event profile will be difficult The next question is what is it you’re gonna do with the data that you have? 
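(Editor's illustration, not part of the spoken remarks.) The point above about person-years and duration-dependent events can be sketched with a small calculation. This is only a minimal sketch under stated assumptions: the patient data, the one-year cut point, and the function names `rate_per_100_py` and `split_by_duration` are all hypothetical and do not come from any trial or from the draft guidance.

```python
def rate_per_100_py(events, person_years):
    """Crude incidence rate per 100 person-years."""
    return 100.0 * events / person_years


def split_by_duration(patients, cut_years):
    """Split person-time and events into 'early' (< cut) and 'late' (>= cut).

    patients: list of (time_at_risk_years, had_event) tuples, where an event,
    if any, is assumed to occur at the end of the time at risk.
    Returns ((early_events, early_py), (late_events, late_py)).
    """
    early_ev = late_ev = 0
    early_py = late_py = 0.0
    for time_at_risk, had_event in patients:
        early_py += min(time_at_risk, cut_years)
        late_py += max(time_at_risk - cut_years, 0.0)
        if had_event:
            if time_at_risk < cut_years:
                early_ev += 1
            else:
                late_ev += 1
    return (early_ev, early_py), (late_ev, late_py)


# Hypothetical arm: eight recent enrollees (3 months each, no events) and
# two long-term patients (3 years each, one with an event at year 3).
patients = [(0.25, False)] * 8 + [(3.0, False), (3.0, True)]

total_events = sum(1 for _, had_event in patients if had_event)
total_py = sum(t for t, _ in patients)
overall = rate_per_100_py(total_events, total_py)

(early_ev, early_py), (late_ev, late_py) = split_by_duration(patients, cut_years=1.0)
early_rate = rate_per_100_py(early_ev, early_py)
late_rate = rate_per_100_py(late_ev, late_py)

print(f"overall: {overall:.1f} per 100 PY")
print(f"first year on drug: {early_rate:.1f} per 100 PY")
print(f"after first year: {late_rate:.1f} per 100 PY")
```

In this made-up arm the pooled rate sits between the early and late rates, so a single person-year figure would understate the risk among patients with long exposure. Splitting person-time by time on drug is one simple way to check for that.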
And I think Mary Ross said no-one’s interested in putting all these INDs on hold That’s not the purpose of this Might you modify your protocol? Might you add some new testing? I think you have to think carefully about that If you come up really something unexpected you might add something to the protocol to look more systematically at something that arose spontaneously within the context of the study And I think for drugs for chronic diseases even if they have a placebo controlled or active comparator control trial many of them often have open label extensions and if there are duration dependent adverse events maybe some different techniques might be used there There’s where the historical comparators might be useful but also very difficult I’m not sufficiently knowledgeable about SACs and DSMBs to make a comment on that and if anybody was looking for what the magic threshold number, (audience laughing) from FDA, I don’t have it (audience laughing) Bob, do you have it? (audience laughing) So those are my comments – Great, thanks to all of the panelists for those comments So one thing that I didn’t hear much of but I didn’t wanna ask the question around but I know you touched on this a little bit when you talked about sort of the issue of multiplicity and particularly in looking at unanticipated SAEs One question is how do you sort of avoid sort of over-reporting unnecessary things? 
So when there might be an imbalance but it’s not meaningful and it should sort of be not reported versus the ones that really should be reported Just views from not just you but anybody on the panel on sort of how to grapple with that – [Salim] Well I’ll give you a simple answer Don’t report it at all – Yeah well (audience laughing) – Let the DSMB Look I think you’re asking the wrong question Greg That is the problem and you will inevitably get the wrong answer I’m not trying to be difficult I think we’ve started off with the wrong question The question is when can you generate reliable information that is meaningful? If, in the end, you get data and as somebody in the previous panel said oh we just tell the sponsor

you tell us what you’re going to do about it That’s useless It’s like on rounds I tell my residents if you’re going to order a diagnostic test that diagnostic test better make a difference to what you’re going to do to your patient So, in a sense, this is a diagnostic test So if you don’t have a plan what you’re going to do once you get the result why even do it? So I’m very serious I’m not being facetious here and I think it would be a total waste of opportunity at this meeting if we try to drill down on detail nonsense on the wrong question It may not have been part of the original agenda It may be more productive to the scientific community and to FDA to refocus the question – [Greg] Thanks – It’s difficult to interpret some of this data but if you have processes in place at your company, for instance, if you have physicians reading your individual cases, your clinical trials and they see something, or something with another class is reported, or there’s some literature article then the experts in the company, which is generally the safety review team, look at that data and decide whether this is something that appears to be due to drug or not It’s a clinical judgment call You can never say for certain whether it is or I mean it’s looking at everything that you have available and personally I think that’s important I do think that we have quite a high bar For instance, at Merck, I cannot go and say that something, a serious event, is something due to drug without having appropriate evidence to support it because my senior management will go back and think about it again It’s science, it’s process, and it’s planning That would be my answer – We look at a lot of things all the time I mean I think, to what Ann’s saying, we have events that are being examined every day even if it’s not in the context of a specific test So I think the issue of multiple testing I think becomes particularly relevant when you’re thinking about conducting a statistical test but many 
times we’re talking about simple percentage differences or rate differences and so an absolute difference we would need to think about all the different times that we would look at it – I’m sorry I think it’s important that if we’re going to send an aggregate IND report to FDA we actually have to present it to our senior safety review committee before we send it down or we have to be relatively convinced that it’s due to drug and we’re doing something about it – Yeah – So I think that’s that’s how we prevent– – Yeah, I think it’s back to Salim What you were saying is only report if it’s significant enough to actually change what you’re doing in the program, yeah – And then if you work backwards from that If we agree that is the guiding principle and then start to work backwards then we say okay what kinds of things? How would we assess it? It would then go back again to the comparative assessment ’cause you don’t wanna make major changes to any program without having some pretty good persuasive evidence to do that So, if you take that view, really 90 percent of this problem will go away – I think that we would argue against a single number in all cases– – Yeah – But I think it would really depend a lot on benefit risk and what it is you’re treating and our whole benefit risk framework would come into play here really making these kinds of decisions – Okay See, I told you I didn’t even open it up to the audience and we’re already getting a line So this is a very (panel laughing) participatory audience that we have (audience laughing) Thank you very much Go ahead – [Audience Member] From Abbey can you hear me? 
– Sure yes – I want to comment on the assigned kind of topic that is the threshold even though there may be a bigger question but I want to comment on the threshold – Yeah – [Audience Member] We have tried to implement two pilots in our company and for that we have tried to establish two kind of a thresholds One step is epidemiological The epidemiological perhaps has, as you have pointed out, may be based on the electronic medical records and all Maybe not that precise because not necessarily resemble the database,

the clinical trial data But at least we can use it as a parameter to measure against, available, all the data we have for all the clinical trial perhaps in the same type of area It help you to at least scientifically around what number is the threshold? The others in the statistical kind of threshold that maybe is not 10 versus one Perhaps is a little bit more complex than that We use both of them to try to establish a threshold to see if an event need to be reported or not But our question again is shall we do it for all clinical trials and for all the products? That’s something that a little bit concerns me because of the amount of resources we take into this activity Just generating the threshold it took about months to get all of these Then to do it across every single reported case is quite difficult Assistance is provided enough by the FDA but perhaps by other organizations about some thresholds at least for measure conditions that would be great Thank you – Any comments from the Or agreement? – I don’t see the point of trying to define the threshold because, quite honestly, whatever the threshold you’ve got huge uncertainty – Yeah, okay – Peter? – Well I’m also not gonna try to define the threshold so start with that point I just wanted to put a perspective forward perhaps as a counterpoint to a certain extent The intent with this is as a starting point to avoid the excessive reporting that we were seeing before where everything that was listed as drug related was sent in regardless of whether there was any rationale to think it really was And that kind of over-reporting of events leads to, unfortunately, really a signal to noise issue How are you gonna figure out which of those hundreds of events that get reported actually are meaningful so we can do something about those events? 
When we’re thinking about the type A and B types of events I think we’ve all agreed that’s not terribly complicated B maybe sometimes a little complicated It’s clearly the type C type of events that we’re trying to understand Well why are we trying to understand that? We’re not talking about routine, non serious adverse events We’re not talking about tolerability issues We’re really focused here on serious, meaningful events within a trial where we believe there could be a causality The drug is leading to an imbalance An increased risk that’s important, that’s clinically relevant Why is that important? I mean it’s important for pretty evident reasons but let me just make sure that I put those forward because I don’t think is a risk benefit issue per se Well it comes back to that so I’m gonna try to connect that back in just a moment I guess the perspective I’d put forward and be interested in the comments is that what we’re trying to do here actually is to protect the trial and to protect the patients We want to make sure if we have a serious risk that was unknown, that’s induced by the drug that patients are aware of it and particularly important that investigators are aware of it because they can potentially mitigate, monitor, assure the continuing safety of the patients in the trial I think we heard earlier about I think Ajay mentioned the GSK drug where there was a question balance in renal insufficiency events I think that’s a pretty good example If you have a drug where that’s not known and during the conduct of the trial it becomes that there really is a signal and I’m not talking about five versus zero ’cause that may or may (chuckling) not be real But where a careful, thoughtful analysis all of what, Ann what you were mentioning, where an appropriate group looks at the totality of information, perhaps from non serious events, from the laboratory database, from the serious events and their careful scrutiny and concludes that the drug has an effect to induce 
events of renal insufficiency in patients who are at risk because of dehydration I think investigators would need to know that so they can properly manage their patients, monitor their patients, and help to avoid, and mitigate that risk That potentially can maintain the trial, keep its integrity ongoing, and actually see whether there’s ultimately benefit that outweighs that risk That’s what I think we’re here to try to figure out which is how in the context of ongoing trials where there is a new safety issue and I might point out I think this is (chuckling) generally gonna be type C events, I suspect, are gonna be the least frequent I think very commonly we know the profile of drugs before we get to a phase three trial and those are expected events They’re in the IB They’re already in the consent If it’s second or third in class it may be because we know it from prior drugs within the class So very frequently these are not gonna occur 20 of these type of reports, I certainly hope, in the context of the trial These are infrequent occurrences but we’re talking here about serious, important events for patients and physicians who are investigators to be aware of

And again, I think, to keep the integrity of the trial going we want to be able to make sure that patients are properly monitored and managed in the context of the trial So just wanted to clarify why we’re doing what we’re doing What we’re trying to figure out is, and I think we heard some very thoughtful comments, when do we think that’s a real event versus when do we think it’s the five versus zero that’s just noise and doesn’t need to be reported? That’s not so straightforward of course but thanks and I’d be happy to hear any comments – Yeah, yeah, great framing Any comments on that last when is it a real event? – I don’t know when it’s a real event but I think that I would like to add that to find those relatively few we kinda need to look at everything So it’s just (Ann chuckling) the flip side of it you know? Is that the specificity is gonna be pretty low – Salim? – You never know whether it’s a real event on any individual case but even on an individual case you can put some threshold so that you’re not drowning in noise and you could have a certain level of seriousness, you can have a certain level of causality, but one of the most important thing is the physician has stopped the study medication If they continue it means in their mind, whatever they say, it’s not linked That again will substantially lower the numbers that you’re going to report So once you put it all together in a well run trial at least in cardiovascular disease depending on the duration people may stop drugs at about 10 percent per year Half to two thirds is because they just don’t want to take the medicines but about a third will be they think there is some side effect So you add a level of seriousness to it and the sponsor collects it but you don’t report that to the regulator You then report that to your DSMB who will then take that data and will make an assessment That will then help you So at every level think of streamlining it Thus far the approach is I’d rather know anything 
that’s happening because I don’t want to miss anything The problem with that is when you’re drowning in noise you can’t see the signal – But I mean I think FDA with the individual case reports, individual SUSARs, you should only be sending down to FDA if you, as the company, think there’s a causal relationship with that specific case So, for instance, at Merck with our IND reporting for serious adverse events we don’t send most of the cases down We only send if we think it’s due to drug and that would obviously then we look at it from aggregate point of view I mean that’s just part of the next step So apparently that has helped a lot especially with oncology studies so they’re not completely overwhelmed with all those SUSARs So we have implemented that – [Greg] Great, next question – [Janet] Janet Wittes, so it’s very rare that Salim and I disagree but, in this case, we do (audience laughing) It seems to me that if one follows what the guidance is saying you’ll reduce the paperwork by 90 percent automatically because of what Ann just said There has to be some causality The difference between the treated and the control group has to be big enough for it to be convincingly bad So I think the guidance actually, if people follow the guidance, then there would be a lot of decrease in reporting In our paper that several of you cited and I feel like those of you who are statisticians are gonna say why in the world did she do this? 
We said forget multiplicity because you can’t deal with multiplicity at all There’s too much Think of it on these individual basis, individual events Individual events that are merged a little bit Not preferred terms but high level terms Things that are like each other are put together and that’s the basis for making a decision about reporting but the idea is not to stop the trial The idea is not necessarily to change anything It’s to provide the regulators with some information because remember they have information from drugs from the same class to be able to put things together in a way that nobody else can So the other thing that I wanted to say at the last session is about DSMBs There’s, and I tried to say this in my presentation,

that DSMBs think of risk versus benefit and this is not a risk versus benefit question This is a risk question What is the drug? What are the adverse events that the drug is likely to be causing? And so if this becomes a responsibility of the DSMB it will require a major change in their culture, in their way of thinking about because the question will be what is the risk independent of the benefit? And that we should be reporting out Right now I think what DSMBs do is be very cautious about reporting something out ’cause they don’t want to hurt the study and if the balance of benefit and risk is good they’ll be quiet Only when something is so bad that you have to change the informed consent or you have to change what the docs know and only when they’re really pretty sure, and how do you like my technical word pretty sure? (audience laughing) (Ann laughing) That what you’re seeing is real only then do they report out Well if they are to take the responsibility as the guidance describes they will have to be reporting in a very different way and I think that’s a very large cultural shift – [Greg] Great, any comments on that or? – I’d say let’s have that culture shift (audience laughing) (Greg laughing) – Next question – Robert Baker, Lilly First, before my question, a clarification to the point that Dr. Wittes just raised At least for our company we mail to investigators and IRBs based on the controlling regulatory body so, indeed, the FDA’s IND reporting rule is making things better for IRBs and investigators in the United States whereas the flood persists in other geographies My question is actually a repeat of the one I asked the first panel but I think it would be quite relevant and probably echo some of what Dr. Yusuf has been saying today which is it would be useful as we get to final guidance to step one step higher to really articulate, clearly, what we are trying to achieve and what we’re not trying to achieve This is really for the agency I guess and what Dr. 
Stein just said I thought was quite clear and aligned with how I would think about it It’s not necessarily clear to us in the draft guidance nor, although perhaps I misunderstood it, Dr. Temple I’m not sure it was exactly what I took from what you said before lunch because I think, for us, the question is is this about the subject protection or is it a bigger requirement in our drug safety evaluation? So, in CIOMS sort of terms, well firstly for subject protection it’s a point in time Eventually the studies end and for product level conclusions we will have better data, we’ll have better analyses, and we’ll have free unblinding of all the experts to make those determinations so if we approach this instead about protecting subjects in the study then in CIOMS sort of terms we would approach it looking at imbalances but not trying to find and report all imbalances but trying to find and report important potential risks would allow us to immediately filter events that aren’t important And then have a secondary judgment that the data as we look at it adds up to a potential risk and not simply a numeric imbalance – [Greg] Thank you, next – [Audience Member] Listening to some of the reactions I think there’s at least some people that are adhering to an efficacy mindset here We need to reimagine this with a safety mindset Efficacy is testing and confirming with p-values and p-values don’t really make any sense here If we did 100 tests blind we’re gonna be wrong five percent of the time You can make adjustments with FDR or something like that You’re gonna be wrong some finite amount of the time What we wanna do is we wanna leverage the multi-disciplinary nature of the safety management team Use statistical experts, use clinical experts, partner together and really try to understand these safety issues So if you recall the plot that was up there for example You’ll notice that it didn’t have just one threshold ’cause if it had one threshold that would imply that you crossed that 
then you have to do something So that’s like a decision rule We have three thresholds on top plus two reference lines on the bottom there So you have the background rate, you have some critical rate determined by the multi-disciplinary team, and you have three thresholds there And at any point in time you can see where you are The level of evidence for a risk elevation at any point in time and across time you can see the trend
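(Editor's illustration, not part of the spoken remarks.) The multiplicity point made just above, that running 100 tests at the five percent level guarantees some false alarms and that a false discovery rate (FDR) adjustment only limits them, can be sketched numerically. This is a minimal sketch under assumptions: the p-values are invented for illustration, the function names are hypothetical, and the Benjamini-Hochberg procedure is used as one common FDR method, not as anything the speaker or the guidance prescribes.

```python
def familywise_error(n_tests, alpha):
    """Chance of at least one false positive across n independent tests
    when no true effect exists and each test is run at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def benjamini_hochberg(p_values, q):
    """Return the indices of the p-values flagged by the Benjamini-Hochberg
    step-up procedure at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under its threshold.
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# With 100 adverse event terms tested at 0.05 and no real effects,
# about five false alarms are expected.
expected_false_alarms = 100 * 0.05
any_false_alarm = familywise_error(100, 0.05)

# Invented p-values for ten event terms, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
unadjusted = [i for i, p in enumerate(p_values) if p <= 0.05]
flagged = benjamini_hochberg(p_values, q=0.05)

print(f"expected false alarms: {expected_false_alarms:.0f}")
print(f"P(at least one false alarm): {any_false_alarm:.3f}")
print(f"flagged without adjustment: {unadjusted}")
print(f"flagged after FDR adjustment: {flagged}")
```

The adjustment shrinks the list of flagged terms but does not say which of the remaining flags are real, which is consistent with the speaker's argument that a quantitative display should feed medical judgment and should not act as a pass/fail test.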

So we’re not trying to come up with a test We’re trying to come up with a quantitative framework to help the whole multi-disciplinary safety management team understand the evidence and that data and drive the decision making process with medical judgment So we can’t take efficacy analyses and re-align them for a safety analysis We have to develop them from the ground up here Sorry not a question again – It’s okay (audience laughing) (panel laughing) We don’t have time for answers anyway so Bob? (audience laughing) This isn’t anything that Janet and Peter haven’t already said to some extent but I wanna be explicit because Salim keeps bringing it up One of the major purposes of rewriting the rule was to do exactly what you wanted to do We were getting all kinds of crap because people would just send it in I gather they still do that for the Europeans and I guess they’re happy enough with it and don’t care We wanted these large numbers of things to be screened for the likelihood or possibility, reasonable probability, whatever it is, that they actually were something the drug did So the whole point of it is to do exactly what you want everybody to do and not report it Not only not report it to us but, when it gets reported to us, it has to then be reported to all investigators who then report it, under their obligations, to report it to the IRBs who then put it in the garage and never looked at it We knew that but the primary purpose was to do exactly what you wanted to do I just wanna be sure everybody knows that (Ann laughing) – So we’re on the same wavelength Can I– – [Greg] Well we are out of this time for this session Maybe 30 seconds to you and 30 seconds to– – Okay, the one thing– – Chris – That this DMC Plus doesn’t quite solve is the idea that across different compounds, in the same class, how do you look at it? 
And this could be an argument for getting things to the FDA sooner than later The only problem with that, different agents of the same class, especially on adverse events can have quite different profiles If you remember Ximelagatran and liver failure it wasn’t seen with the other agents so I think there you want to look at it class by class because with Ximelagatran had they taken all their trials and looked at it they would see the hepatic failure thing So I’m not sure that level of looking at an entire class across different sponsors is that important as you’re developing your program Sure at the end of it you would So I don’t think that should wag the tail of the dog here but– (Bob murmuring) okay great then we’re in agreement – Great – Let’s go home (audience laughing) – 30 seconds to Chris – [Chris] So just a brief comment I have it on the record for international harmonization and Bob you mentioned this but this is a reason we get from some sponsors on global trials– – Yeah – [Chris] For the need to collect SAEs in an expedited fashion is like BfArM demands it and they’re a company that needs the German market and they don’t wanna upset them and so we really need, also, to bring in our international regulatory colleagues to help make sure we can implement this more rational guidance And I think we all wanna give the FDA a lotta credit for being the leading regulatory authority in the world for thinking through how to do this more rationally – Okay great I’d like to thank the panelists and the presentation We’re gonna go ahead and take a 10 minute break If we wanna finish on time today be back here exactly at 2:45 for us to start Thank you (crowd murmuring) Okay welcome back everybody

In this last and final session we’re going to discuss key data and methods issues associated with conducting aggregate analyses of patient level data pooled across the study arms and even multiple studies within a program as we’ve been talking about all day long Joining me on this panel are Mary Nilsson, Research Advisor, Safety Analytics, Global Statistical Sciences at Eli Lilly and Company; Asif Haque, Director of Patient Safety at AstraZeneca; Bill Wang, Executive Director of Clinical Statistics at Merck Research Labs; and Mark Levenson, Director of the Division of Biometrics 7 at FDA I’m gonna first turn it over to Mary who’s going to open up this session with a presentation followed by panel discussion and then we’ll turn it into the audience discussion as well

Thank you Mary? (papers rustling) There – I’m happy to be here to talk about data pooling I’m actually quite excited that this ended up as a session at this workshop I think sometimes this topic gets underappreciated so it’s nice that actually we have an opportunity to talk about this So I’m gonna talk about some of the common pitfalls that we sometimes see when pooling data from multiple studies, and talk about best practices, and trying to avoid these pitfalls And I’ll also talk about how Lilly incorporates best practices in the IND safety reporting process to avoid these pitfalls So as we’ve been talking about we’ve all been focusing on type C This presentation will also focus on the type C type event So we’re talking about the aggregate analyses I’ll be focusing on the case where you have a concurrent control and also the assumption is we will have unblinded data So I guess the challenge we’ve been dealing with through this whole workshop is trying to figure out how to right-size this effort It has already been mentioned We know that it’s a difficult thing to try to figure out whether an event is reasonably likely caused by the drug It’s very difficult even when you have all the data at the time of submission So when sponsors put together a proposed label even when you have a lot of data it’s very difficult So you get aggregate analyses from statistical tables but it also involves maybe multiple aggregate analyses Or it includes a lot of medical reviews You look at events that are similar to each other You look to see if there are events in drugs from the same class You look at all sorts of things when trying to figure out if an event is reasonably likely caused by the drug So how do you right-size that effort when you don’t have as much data? 
So the point of this presentation isn’t so much if you do that during the IND safety reporting process or who does it whether it’s an internal group, or a DMC, or a DMC-like group The main point of this presentation is trying to say whoever conducts these aggregate analyses it needs to be done in a very thoughtful manner We don’t wanna get in a situation where decisions are being made using summary tables that fall victim to some of these very common pitfalls So one pitfall when someone tries to simplify this process or try to make it easy to look at data from multiple studies is it’s very tempting to just take all the studies in a clinical program and just pool it all together What happens in this case you can get in a situation where study and treatment or treatment arm are confounded You may end up with a table where one treatment arm comes from one set of studies Another treatment arm comes from another set of studies and when you have that basically study and treatment are now confounded So when you see a difference you don’t know if that’s because of treatment or it could be because of study More importantly if you don’t see a difference that treatment difference could be washed out because of a study effect So here’s an example of one pitfall This example comes from the literature It comes from a 2002 article so these pitfalls have certainly been around in the literature for awhile The problem we’re seeing is not everybody is aware of these pitfalls or the right people who are conducting some of these analyses may not be aware So, in this example, there are six studies Four are placebo controlled studies and two are active controlled studies So if someone tries to simplify the process so much and just simply pool all six of these studies you often get a table of events or look at the last row You might just look at a total count and a total percentage So, in this case, you might see the percentage of this event is 1.3 percent for your new drug and 1.3 
percent for placebo You’ll look at that and say great That’s even I’m gonna move on to other events But, unfortunately, the underlying data will tell you a different story If you just looked at the four placebo controlled studies you’ll see that at least three out of the four studies actually showed an increased risk So by pooling this data you lost that signal The next two examples are just trying to show that even when dose arms are confounded with study you can get into some misleading results In this case the placebo group, the estimate, comes from two studies The low dose group comes from one Middle dose comes from two studies and the high dose group comes from one study Within study it’s very clear that

there’s absolutely no dose response They’re just equal across doses However, when you pool this data, it actually looks like there is a little bit of a dose response and so if all you’re doing is looking at this last row you’re gonna have people trying to look into this event further and basically probably wasting some time trying to look into this when really it was just an artifact of how you pooled the data This is very similar Just dose arms are confounded with study but this shows that it can go in either direction So this is an example where the first study does show a dose relationship The second study shows a dose relationship but when you pool the studies it actually gets a little less clear It just kinda jumps around a little bit and it may not catch your attention quite as easily So another thing that can happen when you try to simplify the process too much is you can create a summary in such a way that you don’t account for study differences or you don’t adjust for study So even if all your studies have the same arms of interest you can still run into a potential pitfall So here’s an example of that So here are three studies where you have all three studies have your drug and a placebo control So you think you’re great Well you’re directly comparing these things and sure enough you can run into yet another problem You avoided the confounding problem You’re not confounded with study however you still have an issue because of unequal allocation or randomization across these studies So even though the percentages are the same in each one of these studies when they’re pooled it actually looks like the percentages are different And this is the Simpson’s paradox that has been brought up already in this workshop It was mentioned as a potential problem to look out for So what are the best practices? 
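[Editor's illustration] The unequal-allocation pitfall just described can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the study names and counts are hypothetical and are not the figures from the slides. It assumes three studies in which the event percentage is identical for drug and placebo within each study, but the randomization ratio differs from study to study.

```python
# Hypothetical counts chosen for illustration; they are NOT the figures from the slides.
# Each arm is recorded as (patients_with_event, patients_randomized).
STUDIES = {
    "Study 1": {"drug": (30, 300), "placebo": (10, 100)},  # 3:1 allocation, 10% in both arms
    "Study 2": {"drug": (40, 100), "placebo": (40, 100)},  # 1:1 allocation, 40% in both arms
    "Study 3": {"drug": (30, 100), "placebo": (90, 300)},  # 1:3 allocation, 30% in both arms
}


def within_study_pct(studies, arm):
    """Percentage of patients with the event, reported separately for each study."""
    return {name: 100.0 * arms[arm][0] / arms[arm][1] for name, arms in studies.items()}


def crude_pooled_pct(studies, arm):
    """Crude pooling: add up events and patients across studies, ignoring study."""
    events = sum(arms[arm][0] for arms in studies.values())
    patients = sum(arms[arm][1] for arms in studies.values())
    return 100.0 * events / patients


for arm in ("drug", "placebo"):
    print(arm, within_study_pct(STUDIES, arm), "crude pooled:", crude_pooled_pct(STUDIES, arm))
```

With these made-up counts the arms match in every study, yet the crude pooled percentages come out at 20 percent for drug and 28 percent for placebo, purely because the drug arm is over-represented in the low-rate study and the placebo arm in a higher-rate one.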
So again it doesn’t really matter who’s doing this or when it’s being done These are best practices that should always be thought of when pooling data from multiple studies We should just try to avoid confounding when we can and we usually can I think if you don’t have very many placebo controlled studies or active controlled studies you might be in a different situation but for most compounds you can avoid confounding You also need to adjust for study in some manner Now this could be done various ways One way is to just never pool studies and always look at just the individual studies You could just always look at that Now that’s not a bad way to go about it if you’re just looking at a handful of events or you know which events you’re trying to follow but if you are trying to create a system to monitor a whole bunch of events like if you are trying to look at all SAEs to see if there’s anything with an imbalance that might be difficult to do when you have a lot of SAEs to look at A forest plot’s very nice It’s another way to look at all different studies and then maybe a pooled summary However that also can be challenging if you are trying to look at a lot of events but it’s a very reasonable approach if you don’t have very many events you’re looking at Other techniques include study size adjusted percentage, Mantel-Haenszel odds ratio, Mantel-Haenszel risk ratio These are all methods that can be used to help adjust for study And now I know p-values are controversial Our confidence in their use is controversial but some of these can be useful to at least help you determine if you’re in a paradox situation If you have a Mantel-Haenszel odds ratio of one but your percentages look different that will tell you you’re in a paradox situation and the other direction can occur as well So most of these methods for adjusting for study are more in the context of when you have a fair number of events If you are talking about just a handful of events or if all your events are
occurring in an uncontrolled phase or extension phases then different things are needed You may need to just look at the individual cases and medically review the cases You might be able to create an exposure adjusted incidence rate for the entire exposure for your new drug and compare that with literature and do the best you can or with some background rate That gets a little trickier but at least when you have a fair number of events we can certainly do better than some of this crude pooling So let me just take one of the examples I mentioned the study size adjusted percentage That’s one technique that can be used to help avoid some of these paradoxes In this case we are going back to this situation where you have three studies All have the new drug and placebo and all their percentages are the same So if I crudely pool this data it looks like the percentages are different However the study size adjusted percentage actually just weights the data in a slightly different manner It makes sure that the weight of the new drug and placebo get the same weight within a study
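[Editor's illustration] A minimal sketch of the two study adjustments named in the talk, the study size adjusted percentage and the Mantel-Haenszel odds ratio. The counts are hypothetical and are not the slide data. The weighting used for the adjusted percentage (each study weighted by its share of all patients, with the same weight applied to both arms) is an assumption based on the description given here, and the function names are invented for this example.

```python
# Hypothetical counts, NOT the slide data: (patients_with_event, patients_randomized).
STUDIES = {
    "Study 1": {"drug": (30, 300), "placebo": (10, 100)},
    "Study 2": {"drug": (40, 100), "placebo": (40, 100)},
    "Study 3": {"drug": (30, 100), "placebo": (90, 300)},
}


def study_size_adjusted_pct(studies, arm):
    """Weight each study's arm percentage by that study's share of all patients.

    Assumed weighting: the study weight is (patients in study) / (patients in
    all studies), and the same weight is used for every arm of that study, so
    unequal allocation within a study cannot distort the comparison.
    """
    study_sizes = {name: sum(n for _, n in arms.values()) for name, arms in studies.items()}
    total = sum(study_sizes.values())
    return sum(
        (study_sizes[name] / total) * (100.0 * arms[arm][0] / arms[arm][1])
        for name, arms in studies.items()
    )


def mantel_haenszel_odds_ratio(studies, treated="drug", control="placebo"):
    """Mantel-Haenszel odds ratio, stratified by study.

    Undefined (division by zero) if no stratum contributes to the denominator,
    for example when the control arms have no events at all.
    """
    numerator = 0.0
    denominator = 0.0
    for arms in studies.values():
        a, n_treated = arms[treated]   # treated patients with the event
        c, n_control = arms[control]   # control patients with the event
        b = n_treated - a              # treated patients without the event
        d = n_control - c              # control patients without the event
        n = n_treated + n_control
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


print("adjusted % drug:   ", study_size_adjusted_pct(STUDIES, "drug"))
print("adjusted % placebo:", study_size_adjusted_pct(STUDIES, "placebo"))
print("MH odds ratio:     ", mantel_haenszel_odds_ratio(STUDIES))
```

For these made-up counts, crude pooling would give 20 percent versus 28 percent, while the adjusted percentages agree at 24 percent in both arms and the stratified odds ratio is one, which is the pattern described above as the sign of a paradox situation.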

It still allows for the larger studies to have larger weight but at least within a study it ensures that the weight provided to new drug and placebo are the same So when you use this technique you actually see the percentages end up the same It’s 26 percent and 26 percent We think this particular method is probably underutilized I don’t think very many companies use the study size adjusted percentage very often but it is a pretty intuitive method to try to get more and more into predominant practice So briefly how Lilly incorporates this thinking into our IND safety reporting process In our case we do have surveillance teams and safety monitoring teams already looking at blinded data and when there is an event that they need to have unblinded or, if it matters, that it needs to be unblinded that goes to a safety internal review committee So this committee just receives a small number of events They’re not trying to look at a whole bunch of events at one time In this case the data that’s given to this committee comes from the study team The cases and the exposure numbers The denominators by treatment and by study comes from the study team And for the blinded studies the treatment arm denominators are then estimated We’re not unblinding the studies just to get an exact denominator and then a different team, the case management team, can give the treatment assignments for the cases to this committee and then this committee will then run the analyses and determine if it’s a reportable event In this case it’s easy to help keep these best practices or avoiding these pitfalls ’cause it’s a pretty small committee We have very experienced statisticians on the committee and they know how to look at the data They know to look at the data by study or to create some of these more advanced methods if needed So just, in conclusion, no matter who runs these analyses or how often or whenever they are done the idea is that whenever they are done it should be done in a thoughtful 
manner and statistical thinking is still very important Unfortunately we still have a pretty big educational gap on these pitfalls Maybe people here are aware of these but I could tell you a lot of people who are conducting these analyses are not aware of these pitfalls And I’d say the educational gap is cross-functional You might think all statisticians should know this and that may or may not be the case I certainly did not learn this in school or if I did learn it in school it didn’t sink in Now, of course, that was awhile ago and maybe it’s getting better but we still see statisticians coming into industry without the knowledge of this kind of thinking Sometimes it is on the job training when it comes to these kind of lessons learned Certainly for me I was really early in my career one of the first things I had to do was look at an adverse event across multiple studies and created a table, colleague of mine and I went to my senior director and showed the data and said hey this looks great It looks like our event is the same or better compared to placebo and active control And the first thing he said was are you sure there’s not a Simpson’s paradox going on? And I’m like what is that? (audience laughing) So I had no idea And his name was Charlie Samson so I thought he said Samson’s paradox, right? (panel laughing) (audience laughing) So I just kinda giggled and he’s like why are you giggling? 
And I had no idea what he was talking about So I learned pretty quick what a Simpson’s paradox was but not all statisticians will A lotta statisticians are in the world of individual studies for quite some time before they get into the chance to look at integrated analyses So it’s not something that you should assume your statisticians know So we do have an educational gap So for the people who do know these things I guess maybe help with the education The idea is whoever’s doing these analyses we need to be very thoughtful And there is a need for more innovation to try to make some of these methods easier to do and that’s true not only for the IND safety reporting process but even at the end at the submission time We need to do better about making this easier to get these better methods used more broadly (panel applauding) (audience applauding) – Okay thanks Mary We’re gonna go ahead and turn it over to Asif for some comments – Yeah, thank you Mary, lovely presentation I think your presentation really highlights the pitfalls and the challenges that exist with pooling data from different types of studies Some of which have placebo control, some which have an active arm, and also duration or exposure studies Without unblinding cases in the safety database for aggregate analysis we can look at also patient exposure and calculate the incidence rates

and also look at literature rates which you already highlighted While I’m not a statistician I still read up a little bit and learned real fast like you said (audience laughing) about the best method So I came across this really interesting article, a literature article by Jackson and Theo in Wiley Statistics And this talked about the best measures to meta-analysis in trials of differing types and one of the measures which they had suggested is the random effects meta-analysis because this model will allow for true effect variation between study and study and also within the study and also attempt to correct for it The odds ratio and relative risk rate is similar but if the event rate is small therefore it will not give a correct answer The pooled estimate of the log odds ratio and the corresponding confidence interval can then be transformed into an odds ratio scale However the approximation made by the conventional random effects model can be poor when the studies are small or the events are rare The random effects approach also allows generalization to the wider population and the fixed effects approach does not which is a significant limitation of this approach I think we need to decide upon, hopefully with the agency’s approval, a consistent approach to how to match and how to look across studies which are blinded and unblinded and also for differing types That would be my thoughts – Okay Great, Bill? – Yeah, so first thank you and congratulations Mary for pointing out the important issues In some way I feel this pooling is like great, quantitative tools and it’s a tool like a hammer And sometimes when you have a hammer we tend to find the nails I think Mary’s talk really gave a great example of that If you don’t use a hammer carefully you may hurt yourself.
(chuckling) You may not only not hit the nail you actually can hurt yourself So that really make me try to link to what we are doing in this effort in the safety assessment and link to some of the work we are doing at Merck that Ann presented but also link to some of the things we are doing Quite a few people from our ASA safety working group in the audience I happen to be a co-chair of that Some of the work we are doing in thinking about different things I think there are a couple of things when we think about the poolings One is we have to understand what question we try to solve It’s important to really clarify the question, to get the question right, to really clarify what is the nail I hear we have a lot of talk whether it’s Janet using the wheat versus chaff Some people using the chipmunk versus wolf In my paper, I have a paper talk about canary in a coal mine, but some people are using benefit versus risk and I think, Bob, your first talk was about whether there evidence to suggest there’s some relationship? And I think Tom used the words signal versus the noise At the end of the day I think it’s really need to clarify even though, in the last session, I feel we reach agreement but I think there’s still some ambiguity there What exactly is the question? Is it a risk question? 
Or it’s a benefit risk question And that actually is an important differentiation we need to clarify I think Janet in your comment from the audience you think it’s a risk questions and some people think it’s a risk benefit question I think it’s very important to clarify that question very quick, of critical importance, ’cause that would determine what you’re going to do, how you’re going to pool So that’s why in Ann’s presentation the first box of our process is to clarify the questions, to clarify what are the population, to clarify what are the variable of interest, to clarify what are the context that question is that you currently want or that don’t want and to clarify what’s the metrics That’s the importance of the first box We call it a safety estimand and we need to clarify that in order to get a good evaluation because when you have that question clarified you can actually plan for your pooling You can plan what population are we talking about? Are they poolable or not? What event are you pooling? Are they adjudicated so you can actually pool them together? And then what are the inter-current backgrounds that actually make it poolable? So that’s an important part

To ask the right question To plan for the pooling Then when you do the poolings there’s actually many ways of poolings We need to be careful and really think about There’s the pooling across studies and Mary talk about that There’s pooling across time dimensions I think there’s a few examples say pooling early versus pooling late actually can make a big difference There’s also a question about pooling across sub populations and I personally work on ICH E17 We spent a lot of time talk about pooling adverse event across regions and that early planning is very important in order for you to pool the event together Now, of course, in this particular setting it’s really ongoing of the poolings so there’s a couple factor we have to also consider There’s that speed of data collection The so called velocity There’s also the volume of the data collecting If you look too early there may be a risk but if you look too late it defeat the purpose There’s also the variety Different type of adverse event Then also the quality that Dave, I think, Dr. 
DeMets mentioned very highly That’s something we really need to pay attention to make sure that data are collected and cleaned in a timely manner and coded in a timely manner so when you pool it’s pooled in a meaningful way ’cause sometime you need to pool across preferred term into a high level term If you don’t code them you cannot do that So you cannot make those assessment early on and I think another important part is when you try to make a decisions using poolings it’s not just going to be one number to help you make that decision There are many ways you can do it Single number is something we tend to We get used to but with our technology we can do a lot of thing with poolings We can do pooling, we can do dynamic, we can drill down, we can lower I mean all those things I think can be important part of this pooling exercise That’s why we don’t feel a quantitative framework is important Have some kind of threshold guide is what’s important but it’s not going to be one number It’s a set of framework allow you to look at the data and be able to look up and down, pool left and right, to make you that holistic judgment And that’s also why, you know, ASA safety working group we actually have multiple working streams working on all these kind of thing including a dynamic visualization I’ll stop there – Thanks for all of those comments Really appreciate it Mark? – Okay well thank you First I’d like to commend the industry statisticians both on the panel and we heard in the audience I think they’re providing a lot of leadership in this area to flesh out the statistical aspects of this guidance and I think we’ve heard a lot of good ideas I think a lot of what I’m gonna say has been said but it probably functions as a good review here So I’ll point out mainly challenges and perhaps a few solutions First I’d like to point out that we do have an upcoming guidance on meta-analysis for safety We’ve been working on this for some time Ms. 
LaVange worked on it during her years at FDA This guidance actually addresses a lot It’s not focused on this general surveillance problem It’s focused on drug safety when you’re trying to evaluate a small number of hypotheses but a lot of the principles I think overlap and I’ll review some of those principles and they have been mentioned a lot today First of all one of the biggest principles is to stratify by trial which was really the subject of Mary’s talk Crude pooling especially in the circumstances we’re into can lead to all sorts of problems and I think Mary demonstrated many of those problems Another point we make in this guidance is on outcome ascertainment We need good, consistent valid estimates for outcome The more homogeneous across trials the better but a lot can go wrong if you’re not focusing on obtaining the outcome correctly Hopefully prospectively but that’s not always possible because sometimes safety events are sort of discovered retrospectively What we heard, I think, from Bill just now is the importance of planning Planning and pre-specification Now that’s sometimes difficult in safety It’s certainly not the same sort of planning we might see in efficacy but the more upfront thinking and pre-specification is gonna go a long way to making the process more objective and hopefully improving the process So pre-specification of what trials to include, what outcomes to look at, the analyses, the thresholds,

all that is very important Another point this guidance makes is on the totality of evidence In safety I think we’ve heard just a few times today a single p-value is often not very useful So the totality of evidence would include like the effect size, the consistency across trials, and also the plausibility of the results So that’s another point and the final, last point I think that’s relevant from the guidance is having the appropriate expertise that includes the clinical, the epidemiological, and the statistical expertise all in place, all to draw from And another point, I’m sorry, one more point is the clear objectives as Bill pointed out Have clear objectives upfront Okay now let me move on to go beyond that guidance which, hopefully, will be available somewhat soon to some of the challenges and, again, we heard a number of these challenges And the first challenge is pooling across different study types So this may be different patient populations that are different baseline risks That presents challenges Pooling trials of different designs particularly different follow up Do you combine a short term trial with a long term trial?
Well it probably depends on the outcomes you’re interested in and it’s important to use the right statistical measures that don’t mess up the combination of short term and long term trials For example exposure adjusted estimates or maybe you’re looking at, you know, base hazard rates or so So different follow ups presents the problem and I think we heard something about extension trials today Often, from our experience, we see safety issues at least the counts appear in the extension trials Perhaps that’s what they’re there for, those extension trials, to get the long term safety information But there are often uncontrolled portions and then we’re often left with a bind that we never really get out of Another issue with combining trials is what we call the small versus big trials We often see, in the kinda meta-analysis safety world, there’s some big trial that highlighted some issue so let’s do a meta-analysis to combine this big trial with a whole bunch of small trials and basically, because the big trial dominates, you’re just getting the same result back So it’s important that the information is not diluted If small trials are relevant they may need to be looked at separately And of course you may, at the end of the day, not choose to pool at all You may feel that the trials are so different they have to stand on their own but nevertheless, looking at things like forest plots and other summaries across trials is still useful So then there’s the challenge with ongoing trials This is what came up throughout the day with terms like multiplicity and multiple outcomes and I believe there was comments that we’re not trying to do efficacy here so the standards are different But nevertheless this is a challenge that we’re ongoing trials, we’re probably looking at them repeatedly, and we’re looking at multiple outcomes So we need systems and methods to address that and we heard things like false discovery rate There is things like cumulative meta-analysis that’s 
designed to address this and, of course, sequential Bayesian methods as well But, again, this is another area where pre-specification can help If you decide ahead of time how this is gonna be handled then it becomes less arbitrary during the process Brings me to the last challenge which I don’t have much to say about is the challenge of using blinded data It’s almost impossible to aggregate blinded trial data by treatment arm Naturally So I mean (Greg laughing) you can come up with like worst case scenarios like if this fell this way in this case You can play around with bounds with the worst it could be But, you know, the idea of aggregating with blinded data is not a very reasonable thing to do So there you’re probably gonna be in that situation that was described with the two stage approach You might look to see if their overall rate is beyond some threshold that would force you or encourage you to unblind the data So just two final comments is, again, we’re not in the confirmatory exercise here This was said earlier but we can accept false positives This is not the end of the story and even if it’s a false positive forever

it doesn’t mean, necessarily, the death of the drug So we’re not in the same situation as efficacy We’re willing to accept some mistakes and hopefully those will not have major implications And the alternate is the balance between, of course, the false negatives and that finding important safety issues so, of course, that’s always a balance So I’ll end with something I’ve said throughout that we heard on the panel that upfront planning and documentation is important What events that you’re gonna consider, the trials that are gonna be included, the thresholds, the analyses, the procedure, when to unblind, the more thinking upfront the more objective this process will be Thank you – Great thanks Mark A comprehensive review I’m looking forward to that guidance to come out (Mark laughing) So and I appreciate the sort of the range of challenges that can come in with pooling that you brought up Pooling trials of different patient populations, different study designs, different follow-ups, including on-going and completed All of those things and even different trials of different sizes but, Mary, you pointed out a lot of the pitfalls in doing this and the study size adjusted method might address that latter one But let me get a sense from the panel on where are we in the state of the art of analytic tools for pooling in terms of really addressing these challenges? Mary you did mention that there are tools available and they’re really good ways to do this it’s just educating the statisticians, getting the best practices out there But are all of the best practices there? Is there still room for improving on the analytics side? Do we need new tools and better tools to pool these data in a more appropriate way? Thoughts on it? – I don’t have a microphone – You don’t have a microphone (panel laughing) (audience laughing) Don’t you have, weren’t you, oh, okay, no Okay, why don’t you stand here?
– Okay – Okay I’ll go like that – I do think there are a lot of tools available today that are underutilized I mean that’s true The study size adjusted percentage is something that we’re not aware of too many people using that at all I’m part of a PhUSE working group and it’s the FDA, industry, academia collaboration and there is a project team within those working groups working on standard analyses for safety data And there’s one whitepaper, the adverse event whitepaper, that does advocate for a broader use of the study size adjusted percentage Just, in general, when doing aggregate analyses but even more so they could most likely even be used in labeling Some of the percentages in labeling come from these crude percentages so even if our labels sometimes we have percentages that could be improved upon So we think that’s a pretty good technique that can be used today The Mantel-Haenszel methods are great today I think the challenges that still could use a lot of innovation is when you don’t have much controlled data what you can do What you need to do when you don’t have controlled data I think is definitely open for a lot of additional innovation and thought in how best to look at the data when you’re in that situation But for controlled data I think we have a lot of good tools already available and I think it is a matter of just trying to get some of these practices into predominant practice And we’re trying This PhUSE Working Group we have the whitepaper Also my supervisor, Brenda Crowe, created a YouTube video on what the study size adjusted percentage is It’s a 10 minute video clip that’s intended for a broad audience It’s for even non-statistical audience It’s a pretty easy video to understand so I think a lot of it has to do with education but definitely more innovation’s needed outside of the controlled setting – Following what Mary just said I think the meat of the art is consistency and agreement on what exactly is the approach that’s required Like you 
said there’s so many approaches available There are tools for, again, specifically when we come to the safety assessment for an IND safety requirement At that time we need to know exactly what we need to do and which should be the most acceptable for regulators Because when we talk to our stats department and they’ll come back to us and ask what do you think is the best way to group? Because there isn’t an aligned view right now So I think the meat of that is just to come to an agreement on what would be the best way to do it Thanks – So when I think about tools and many times we get to think about the tool about uncertainties But I actually feel a big opportunity there is not only about uncertainty it’s really clarify the ambiguities So when I think about tools I think about tool in a more general sense Do we have tools to help us make things be more clear?

Clarify the questions first before you think about how to address the uncertainties So for that, actually, I feel the recent work on the ICH particularly on the E9 side using the estimand framework even though they focus on the efficacy but that framework is very helpful even in safety settings to really clarify the question particularly in this setting Are we talking about risk? Are we talking risk versus benefit? So I think that type of tool, that general framework of tools, is important to think about before we get down to the uncertainty tools Now on the uncertain side with all the technology, with all the visual tools, there’s many thing available and particularly in the area of dynamically visualized things To see things in a so called dashboard way and to be able to drill down real well That, I believe, can really help a lot in the decision making Not to mention a lot of statistical tools – [Greg] Great, question? – [Salim] It’s a comment This session’s almost deja vu (Bill laughing) I could have rewound to 1980 and have these points actually said The principles, the methodology, the statistical approach, the display of meta-analysis is 40 years old and there must be 10s of thousands of papers on it That’s exactly the methodology you need to use Never pool across studies Always look for differences within studies Calculate variance between studies Weight for the size of the information which is not just the number of people but the number of events Make sure your definitions for your event are reasonably comparable There’s a huge literature – Yeah – [Salim] And it’s wonderful you’re discovering it now (audience laughing) (Salim laughing) So please this is not new Go back and read the literature and I’ve been involved in meta-analysis since the 1980s and the best way to do it is an individual subject approach You cannot do it for every SAE or AE You just drown in it and by doing an individual patient approach you can actually do a time to event analysis So you 
don’t get into the issue maybe things are happening early in some cases In other cases it’s happening late You can actually study it So to the gentlemen and ladies on the podium please go and read the literature – Can I add to that? – Sure go ahead – Yeah I think Professor Yusuf you are right This tool has been there for awhile and now we have an opportunity to use it smartly with a new set of questions So, for example, in this particular setting, it’s not what I call a static meta-analysis It’s a dynamic meta-analysis where when you have data that’s moving ongoing with this ongoing data when the evidence still accumulating that type of decision making will be different from what I call end of all their trials Now you want to do a meta-analysis So that’s something, I’m not saying it’s not there, but it’s, in this particular setting, it cover unique different problem – Mary you seem to have– – Okay – [Greg] Read the literature, you wanna? – Right. (laughing) (Greg laughing) You know you’re right It has been around for a long time and, as I mentioned, I learned pretty early in my career a pretty big lesson.
(laughing) And that was a long time ago. So the challenge we’re having right now is trying to get good pooling practices into predominant practice, and even though we’re years later, we are still struggling with that challenge. There have been conferences. There’s a lot of literature. There’s a lot of word out there, but somehow it’s not getting to everybody who needs to know, because I still think we see a tendency to just pool data, do crude percentages, and look at the data that way. And, unfortunately, that’s still part of predominant practice, so I think we’re all looking for help to try to impact predominant practice. I know the medical field has similar issues. Usually it takes 15 or 20 years to go from a medical innovation to actually having it be part of predominant practice in the doctor’s office. So I know other fields have similar problems, but it feels like it’s time, right? We say this in the PhUSE adverse event white paper. It is time to use these methods, right? (laughing) It is past time to be using these methods, so I couldn’t agree with you more. They’ve been around, but we’re still struggling with getting them into predominant practice.

– [Audience Member] I have a couple of minor comments. I’m actually old enough to remember the stuff from the 1980s. (audience laughing) But this is more dynamic than a traditional meta-analysis, so that’s one. You’ve got this cumulative aspect, where you wanna know when you’re supposed to stop and do something, so that makes it different. It’s more like quality control...

– Yeah.

– [Audience Member] ...in that sense. And then the other comment I have, which no one seems to have addressed, is that with meta-analysis you have to have a certain amount of caution when the events are rare, because so many of the procedures that are out there use large-sample approximations. When you talk about big trials, that’s fine. I’m sitting at FDA now, working in the rare disease space, and those trials are tiny. So we need to recognize that it’s not one size fits all, and that’s one of the cautions I have.

– Just...

– Can I just give a...

– [Greg] Just an observation. You know, we sort of, and Mark, you probably, in the post-market world, you’re talking about pharmacoepidemiology and so many analytic tools to look at real-world data, and the uptake of the newest methods, in comparison, seems to be really fast. Propensity scores and different ways of doing that, and even artificial intelligence, machine learning, and all of those things: people seem to really want to learn them quickly. But then we go back to good pooling practices, and it sounds like the best methods from the 80s are still not being used. Beyond education, I’m just wondering in the back of my mind: what would it take to increase the adoption of these best methods?
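Editorial illustration, not part of the spoken remarks: the contrast drawn in this exchange between crude pooling and comparing within studies can be sketched in a few lines of Python. The two trials and all counts below are hypothetical, chosen only so that each trial on its own shows no treatment difference while the randomization ratios and background risks differ. The Mantel-Haenszel risk ratio is used here as one standard stratified estimator; the speakers did not name a specific one.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Event counts and arm sizes for one trial (hypothetical data)."""
    events_trt: int
    n_trt: int
    events_ctl: int
    n_ctl: int


def crude_risk_ratio(studies):
    """Risk ratio after simply adding all arms together (crude pooling)."""
    events_trt = sum(s.events_trt for s in studies)
    n_trt = sum(s.n_trt for s in studies)
    events_ctl = sum(s.events_ctl for s in studies)
    n_ctl = sum(s.n_ctl for s in studies)
    return (events_trt / n_trt) / (events_ctl / n_ctl)


def mantel_haenszel_risk_ratio(studies):
    """Stratified risk ratio: each study is compared within itself,
    then the within-study comparisons are combined."""
    num = sum(s.events_trt * s.n_ctl / (s.n_trt + s.n_ctl) for s in studies)
    den = sum(s.events_ctl * s.n_trt / (s.n_trt + s.n_ctl) for s in studies)
    return num / den


# Assumed example: a low-risk trial randomized 1:4 and a high-risk trial
# randomized 4:1. Within each trial the event rate is identical in both arms.
studies = [
    Study(events_trt=5, n_trt=500, events_ctl=20, n_ctl=2000),    # 1% vs 1%
    Study(events_trt=200, n_trt=2000, events_ctl=50, n_ctl=500),  # 10% vs 10%
]

crude = crude_risk_ratio(studies)
stratified = mantel_haenszel_risk_ratio(studies)
print(f"crude pooled risk ratio:    {crude:.2f}")
print(f"Mantel-Haenszel risk ratio: {stratified:.2f}")
```

With these made-up counts the crude pooled ratio is roughly 2.9 while the stratified ratio is 1.0, so the apparent excess comes entirely from mixing trials with different allocation ratios and background risks. As the audience member cautions, when events are rare and trials are small, large-sample summaries like these may be unreliable and exact methods may be more appropriate.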
– (laughing) Well, I agree that the methods have been around for a while. The understanding of them has been around for a while. There are some unique aspects, as pointed out: the cumulative, dynamic nature. We’re also dealing with rare event situations, which require different statistics. But I think a lot of why meta-analysis principles aren’t always used is probably similar to why good observational study principles are not always used. There’s a lotta data out there. It’s easy to analyze, and a lot of people don’t have the necessary training. In terms of overcoming that, I don’t know. I’m not interested; it’s not my job to make the whole community, but we hope that our guidance will have some effect in the drug regulation world.

– [Greg] Yeah, great. Next question?

– [Audience Member] This is a very small historical note. (Greg laughing) Simpson’s paradox was first described in 1899 by Pearson, and then again in 1903 by Yule, and named Simpson’s paradox in 1950. So these are pretty old methods. (audience laughing)

– And I just wanted to add, there’s a large literature on what’s called cumulative meta-analysis. Tom Chalmers proposed it in the late 1970s. We wrote papers about the problems of cumulative meta-analysis and how you have to use the concept of optimal information size and monitoring boundaries to assess it. You have to think about hypothesis-generating trials, separate that information off, then look at complementary data. You call it dynamic. It really is being tested under the concept of cumulative meta-analysis, so I think this process can benefit from a good shakeup and a thorough delving into the literature. Huge methodological thought has gone into it and, in fact, some of the meta-analyses we did in the 1980s were on safety, on thrombolytic trials. It wasn’t just efficacy.

– Oh, okay.

– Just a quick comment. I definitely agree. I think it’s human nature to always want the best toy that has come up, and you forget about what...

– Shiny things. The shiny new toy.
(audience laughing)

– The shiny part. So I think you are right, and some of those toys have been there for a long time. It really defeats the purpose. It’s something we forget. We only think about the new thing coming up.

– [Greg] Well, maybe Mark’s guidance will have a big shiny...

– Yeah, we wait for your guidance. (audience laughing)

– [Greg] Next question.

– [Susan] Thanks, I’m Susan Duke. I’m currently at FDA. I haven’t been there that long. I have been in industry for quite a long time, though, and have had the opportunity to lead and be involved in industry-wide working groups for quite some time. My comment today is about culture and how we work together. A colleague of mine who’s sitting here, Ming Chun Li, works for a small non-profit. She’s a clinician. She and I met at DIA last summer; I’m part of Bill’s working group, and we had been looking for safety clinicians to work with. A few months ago Ming Chun made a little cartoon, and in her cartoon she has a stick figure on the left. It’s a safety clinician at her desk going, “Hepatotoxicity, adverse events, ECTs! Where’s the signal?” And then there’s another panel with a statistician that says, “I make perfect methods. I make beautiful graphs.” And in the panel at the bottom, the two people are looking at each other saying, “We could work together.” So I think part of this is that, each within our own silos, we’re developing things and we’re figuring things out. As I have been working on the working group that Bill started, now that we’ve added safety clinicians into that group, it’s amazing how much we are able to support each other with our respective knowledge and how we can see things. Ming Chun came to a methods webinar in ASA Biopharm last fall, and she said, “I want to know more about those methods.” The statisticians thought only statisticians would want to know that. So I’m just giving a few examples of how, since all of us are interested and all of us have these different perspectives on safety, if we’re working together we can really get this to where we would like it to go, and we don’t have to be so frustrated about methods that have maybe been around for decades or centuries not being put to good use.

– [Greg] Yeah, thanks. Any reaction or comments? Okay, any other questions or comments from the audience?
Okay, I think we’re done. (audience laughing) So let me go ahead and close today’s event. I’ll ask my panelists to just stay with me here up on stage, but thank you for the presentations and all of your comments.

Across today we definitely learned a lot. The discussion was very productive. We covered a lot of ground and discussed, in depth, some of the major areas of the 2015 draft guidance where there are remaining questions on how to implement these approaches, and what the best approaches are for doing IND safety reporting in a way that gets reduced but really meaningful information to the FDA.

In the first session we talked a lot about issues in identifying serious anticipated events. It’s not as easy as it seems. We had lots of discussion on how to identify and establish background rates. Randomized controlled data are best, but there were some ideas about when you have to use real-world data. There are challenges in doing that: one cannot necessarily mirror the clinical trial population, and so you may not be able to get the background rates that would be the most appropriate for that clinical trial population. But with ongoing work and methods development, that could still be an opportunity as a source of data when it’s needed. We also discussed the challenges in using passively reported data versus more structured data, and many remaining questions on the appropriate use of background rates. We talked a lot about category C type events, and when to report, and all of those things. So I’m not gonna summarize that whole discussion.

Moving on to session two, we debated, at length, the issues of unblinding while still maintaining trial integrity, and the corresponding roles of the SAC, the DMC, the DSMB, plus all of these things, firewalls. The question of unblinding only the SAE cases was brought up, and whether or not doing that actually does challenge the integrity of the trial. It was a really important question. I’m not sure we got to the answer, but that’s a remaining question that we have.

And then turning to the most recent sessions: well, I was up here, so I wasn’t able to summarize those last sessions, but you can remember them. Session three was on going back to category C type events and what the threshold would be. I think what we established is that there’s no threshold, that it’s probably still on a case-by-case basis, but these events should be taken in the totality of evidence. There was some debate on whether, when reporting to the FDA, that should be characterized in terms of the benefit, or if this is really just about risk and we need to look at safety and risk on its own at this particular stage.

Then we moved on to session four, which did talk about the methods from back in, even before, the 80s, it seemed like, but we’re still not getting to good pooling practices, and about opportunities to increase the education and awareness of these good pooling practices. But this all comes back to the overall goal of this guidance, which is not about when to stop a trial or when to change things. It’s more simply about when we know that there’s a risk that’s greater than what was expected, and whether it is associated with the particular drug.

So with that I’d like to go ahead and thank everyone for staying with us throughout the day. This was a small group at the beginning, but you stuck with us and you were very engaged throughout the conversation, so I thank all of you. I’d also like to thank all of our panelists and presenters. You put forth very on-time (chuckling) but very useful presentations that were packed full of lots of very useful information. I’d like to especially thank our partners at FDA: Jacqueline Corrigan-Curay, Diane Perron, and Kahir Alzirod. I didn’t look at this ahead of time, I’m sorry. And especially Jacqueline and Lisa for moderating today. That was very helpful to us. Lastly, I’d like to thank Fatim Aduque, Adam Aten, Morgan Romain, Catherine Frank, Elizabeth Murphy, and Sarah Supreci for all of your efforts in putting this together. I’d like to thank the folks from the web for tuning in and sticking with us throughout the day. Thank you very much for staying with us. I appreciate your thoughts and comments, and have a great afternoon. (audience applauding)