This article proposes a phase I/II clinical trial design for adaptively and dynamically optimizing each patient’s dose in each of two cycles of therapy based on the joint binary efficacy and toxicity outcomes in each cycle. A dose-outcome model is assumed that includes a Bayesian hierarchical latent variable structure to induce association among the outcomes and also facilitate posterior computation. Doses are chosen in each cycle based on posteriors of a model-based objective function, similar to a reinforcement learning or Q-learning function, defined in terms of numerical utilities of the joint outcomes in each cycle. For each patient, the procedure outputs a sequence of two actions, one for each cycle, with each action being the decision to either treat the patient at a chosen dose or not to treat. The cycle 2 action depends on the individual patient’s cycle 1 dose and outcomes. In addition, decisions are based on posterior inference using other patients’ data, and therefore, the proposed method is adaptive both within and between patients. A simulation study of the method is presented, including comparison to two-cycle extensions of the conventional 3 + 3 algorithm, continual reassessment method, and a Bayesian model-based design, and evaluation of robustness. Supplementary materials for this article are available online.
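
The two-cycle decision logic described above can be illustrated with a minimal backward-induction sketch. It is not the article's model: fixed, hypothetical outcome probabilities stand in for posterior quantities, efficacy and toxicity are assumed independent (the article instead induces association through latent variables), and the utilities, the value of not treating, and the carry-over effect of a cycle 1 toxicity are all invented for illustration.

```python
from itertools import product

# All numbers below are hypothetical placeholders, not values from the article.
UTILITY = {(1, 0): 100.0, (1, 1): 10.0, (0, 0): 40.0, (0, 1): 0.0}  # (efficacy, toxicity)
NO_TREAT = 30.0            # assumed per-cycle utility of the "do not treat" action
OUTCOMES = list(product((0, 1), repeat=2))
P_EFF = {0: 0.3, 1: 0.6}   # assumed efficacy probability at each dose level
P_TOX = {0: 0.1, 1: 0.3}   # assumed toxicity probability at each dose level
DOSES = sorted(P_EFF)


def joint_probs(p_eff, p_tox):
    """Joint outcome probabilities, assuming independence for illustration only."""
    return {
        (e, t): (p_eff if e else 1 - p_eff) * (p_tox if t else 1 - p_tox)
        for e, t in OUTCOMES
    }


def cycle1_probs(d1):
    return joint_probs(P_EFF[d1], P_TOX[d1])


def cycle2_probs(d1, y1, d2):
    """Toy carry-over: a cycle 1 toxicity raises cycle 2 toxicity risk by 0.4.

    The cycle 1 dose d1 is accepted but unused in this simplified stand-in.
    """
    bump = 0.4 if y1[1] == 1 else 0.0
    return joint_probs(P_EFF[d2], min(1.0, P_TOX[d2] + bump))


def expected_utility(probs):
    return sum(probs[y] * UTILITY[y] for y in OUTCOMES)


def best_action(q_values, baseline):
    """Pick the dose with the largest value; None means 'do not treat'."""
    dose = max(q_values, key=q_values.get)
    if q_values[dose] > baseline:
        return dose, q_values[dose]
    return None, baseline


def optimize_two_cycles():
    # Step 1 (cycle 2): best action for every cycle 1 dose and outcome.
    policy2, value2 = {}, {}
    for d1, y1 in product(DOSES, OUTCOMES):
        q2 = {d2: expected_utility(cycle2_probs(d1, y1, d2)) for d2 in DOSES}
        policy2[d1, y1], value2[d1, y1] = best_action(q2, NO_TREAT)
    # Step 2 (cycle 1): immediate utility plus the value of acting optimally later.
    q1 = {}
    for d1 in DOSES:
        probs = cycle1_probs(d1)
        q1[d1] = sum(probs[y1] * (UTILITY[y1] + value2[d1, y1]) for y1 in OUTCOMES)
    # Assumed convention: not treating in cycle 1 means not treating in cycle 2.
    action1, _ = best_action(q1, 2 * NO_TREAT)
    return action1, policy2, q1


action1, policy2, q1 = optimize_two_cycles()
print("cycle 1 action:", action1)
for (d1, y1), a2 in sorted(policy2.items()):
    print(f"cycle 1 dose {d1}, outcome {y1} -> cycle 2 action {a2}")
```

With these invented numbers the higher dose has the larger single-cycle expected utility, yet the two-cycle objective favors the lower dose in cycle 1 because a cycle 1 toxicity degrades what can be achieved in cycle 2. In the actual design these quantities would be computed from the posterior and updated as other patients' data accrue.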