>> Good morning, I'd like first to thank the organizer
for giving me this opportunity to share with all of you the bioinformatics
challenges and technical issues that we face at RTOG, and the projects
that we have undertaken to try to resolve those issues.
So for those of you who are not familiar with it,
RTOG, which stands for Radiation Therapy Oncology Group, is an institution
that was founded in 1968 and is funded through NCI to conduct clinical trials
for adult cancer, with the objectives to improve survival outcome
and quality of life, to evaluate new forms
of radiotherapy delivery techniques, to test new systemic therapies
in conjunction with radiotherapy,
and to employ translational research strategies.
Now, the Bioinformatics Working Group was formed within RTOG
to facilitate the development of personalized predictive models
for radiation therapy guidance, using the specific characteristics
of patients and treatments from integrated clinical trial databases,
and to bridge clinical science, physics,
biology, information technology and mathematics.
Now, the two major components of the bioinformatics efforts at RTOG,
as with all bioinformatics efforts, are databases
or database integration, and data analysis.
For the database, we have rich collections of RT dose,
RT images and clinical data, as well as genomic and
proteomic information from biobanks, and biomarker information.
By mining and analyzing the data in these databases,
we could help protocol development and protocol operation,
and facilitate trial outcome
and secondary analyses and other related research.
So in the following slides I will now present a number of examples of projects
that are under way or that we're trying to start in the RTOG Bioinformatics Group.
And the first two are for data and data integration.
And you see from this table, we have vast clinical data from a number
of clinical trials that cover multiple disease sites,
which include head and neck, lung and prostate.
And you can see that this data can be used
to model the tumor control probability and to model toxicity
such as salivary function and late or acute GU/GI toxicities.
And a number of projects by Dr. Deasy and Dr. Tucker
were successfully funded through NCI.
Now, this example is a project that RTOG has started
in collaboration with MAASTRO, an institution in the Netherlands
where advanced radiotherapy research has been going on.
Dr. Andre Dekker, the head
of MAASTRO Knowledge Engineering, spent half a year late last year with RTOG,
and we have established this collaboration, setting up a system
of rapid learning and computer assisted diagnostics between RTOG and MAASTRO.
The following slides were used by Dr. Andre Dekker at the end
of his visit at RTOG to report the progress of this project.
So why do we need rapid learning and computer assisted diagnostics?
It's because we want to achieve personalized medicine
that improves survival and quality of life.
As you see from this graphic, there has been an explosion of data over the years:
in addition to the general clinical information,
we have structural genetic information, for example,
and functional genetic information and proteomics
and other molecular information, in addition to diagnostic imaging
of functional and anatomical type.
How we can use this explosion of data to make clinical decisions is essential
for us to move forward with personalized medicine.
One example was presented along with that:
an experiment was conducted in which eight radiation oncologists were presented
with 30 patients with non-small-cell lung cancer and were asked
to predict two-year survival from the
patient characteristics.
And you can see that the performance
of these eight radiation oncologists has an area under the curve of 0.57.
We understand that 0.5 is pretty much random prediction,
one is a perfect model,
and 0.85 to 0.9 would be clinically acceptable.
So this is not too far from a random prediction.
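
As a rough illustration of the area under the curve figures quoted above, the following Python sketch computes the AUC as the fraction of (survivor, non-survivor) pairs that a predictor ranks correctly. The function name and the six toy predictions are made up for illustration; they are not the data from the experiment.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    scores: predicted probability of the event (e.g. two-year survival)
    labels: 1 if the event happened, 0 otherwise
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    # Count the pairs in which the positive case is ranked above the
    # negative one; a tie counts as half a correct ranking.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for six patients (illustration only).
labels = [1, 1, 1, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
uninformative = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
print(auc(perfect, labels))        # 1.0: every pair ranked correctly
print(auc(uninformative, labels))  # 0.5: no better than chance
```

Read this way, an AUC of 0.57 means that a randomly chosen survivor is ranked above a randomly chosen non-survivor only about 57 percent of the time.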
So now, how do we get data for rapid learning?
The problems are not just technical; they are ethical,
political and administrative: administrative in terms of the time that's required
to get the data together, political in who owns the data, and ethical in
how we maintain the privacy of patients.
Now the CAT approach is that an IT infrastructure is being developed
to make the radiotherapy centers semantically interoperable, which takes care
of the administrative issue; the data actually stays within the institution,
which takes care of the ethical issue, under the full control of the institution,
which resolves the political issue.
So the components of this CAT system are as follows: the data are exported from the CTMS
and PACS systems, go through ETL, are deidentified and then filtered
into an oncology database, and then the user would just query
and retrieve from such a database
to obtain the outcome in a standard format, XML or DICOM.
And applications can be shared to analyze this data,
or distributed learning algorithms can be run on this data.
Now the key feature of the system is that there is no sharing
of data and it is truly federated.
Both the community data and clinical trial data can be connected together,
and we use the extended NCI ontology library with formal additions
to this library. Five languages, five countries
and five legal systems have been tested with this system.
The major focus now is on radiotherapy,
and we have a lot of help from industry involvement.
This is the network as it stands so far,
and we're actively talking to Chinese centers
to see whether we can extend this CAT system to China as well
as India and other countries.
So one example shown here to demonstrate
how this system works is that we connected the database from RTOG 0522
to test a model of laryngeal carcinoma that was developed
from the MAASTRO group of patients.
These are the input parameters that went into the modeling, and the outcome
we studied was overall survival.
This just shows how we query the larynx oncology database.
And this is the result that we obtained from our research together,
showing the area under the curve plot as well
as the stratified survival curves.
And then Dr. Andre Dekker went ahead
and tested this distributed learning architecture where,
instead of sending the data out, the parameters from the model were sent
to a centralized server, combined there into an updated model,
and sent back to the individual model servers.
This iterative process would produce a final optimized model.
Here you can see that the performance
of the distributed learning operation is somewhat better than that of the models
that we obtained from individual databases.
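
The talk does not say which learning algorithm was used, so the following is only a minimal sketch of the general idea, assuming a logistic regression model trained by averaging gradients. Each site computes an update on its own patients, and only model parameters and gradients travel to and from the central server. The function names, the two toy "sites" and the learning settings are all hypothetical.

```python
import math


def local_gradient(weights, rows):
    """Gradient of the mean logistic loss on one site's own patients.

    rows: list of (features, outcome) pairs that never leave the site.
    Returns the gradient together with the number of patients used.
    """
    grad = [0.0] * len(weights)
    for x, y in rows:
        z = sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))
        for j, xi in enumerate(x):
            grad[j] += (p - y) * xi
    return [g / len(rows) for g in grad], len(rows)


def train_federated(sites, n_features, rounds=500, lr=0.5):
    """Central server loop: only weights and gradients cross site boundaries."""
    weights = [0.0] * n_features
    for _ in range(rounds):
        replies = [local_gradient(weights, rows) for rows in sites]
        total = sum(n for _, n in replies)
        # Weight each site's gradient by its number of patients.
        avg = [sum(g[j] * n for g, n in replies) / total
               for j in range(n_features)]
        weights = [w - lr * a for w, a in zip(weights, avg)]
    return weights


# Toy data: features are [intercept, one covariate]; outcomes are 0 or 1.
site_a = [([1.0, 0.0], 0), ([1.0, 1.0], 0), ([1.0, 2.0], 1)]
site_b = [([1.0, 1.5], 0), ([1.0, 2.5], 1), ([1.0, 3.0], 1), ([1.0, 0.5], 1)]

federated = train_federated([site_a, site_b], n_features=2)
pooled = train_federated([site_a + site_b], n_features=2)
print(federated)
print(pooled)
```

In this simple setting the federated result matches what pooling all rows in one place would give, up to rounding, which is the property that lets the data stay inside each institution.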
So there are a number of obstacles to moving this forward
so that more institutions can adopt this system.
One of them is the associated cost, and we're trying to obtain funding
so that we can use all open source components
so that it can be more readily accessible for individual institutions.
Now, the second project, which we have just started to explore, is that we are,
under guidance from NCI, exploring the possibility
of clinical trials comparing carbon, proton and photon radiotherapies.
We invited Dr. Stephanie Combs
to our RTOG Bioinformatics meeting in June,
where she presented the database system that is used at the particle center
in Germany, which is called ULICE.
So, particle therapy is a very new and promising technique
in radiation therapy for cancer treatment.
Now we have carbon accelerators and protons.
The advantages of particle therapy are that there is more precise dose delivery
to the target, thereby offering the advantage of sparing normal tissue
and organs at risk, and also the enhanced radiobiological effectiveness of carbon ions,
which has the potential to increase local tumor control.
Now how do we demonstrate from the clinical outcome the advantages
that had been explored theoretically in radiobiological research
or physics research? We need to organize randomized trials
to establish the clinical advantage of these particle therapies.
Now, the Heidelberg Ion-Beam Therapy Center
started to treat patients at the end of '09, and the main focus is
to have clinical studies to evaluate the benefits
of ion therapy for several indications.
The ULICE project is the Union of Light Ion Centers in Europe,
and they got together to develop a database with transnational access
to perform international multicenter clinical studies, which should be accessible
by both external and internal oncology students and researchers.
So they were trying to establish a common database for hadrontherapy
to exchange clinical experience, to set up standards and harmonize study
and treatment concepts, and to transfer know-how.
Their paper has just been published in Radiation Oncology, in the July issue,
which appeared a few days ago.
Their approach is that they established a centralized web-based system
with interfaces to the existing information systems of the hospital, so as
to avoid redundant entry, and to offer study-specific modules,
and they have implemented security and data protection measures
to fulfill legal requirements.
Their database is an SQL database with the capability
to be dynamically extended, with interfaces to the standards DICOM and HL7,
and with a Java applet for manual import of data or for receiving
and sending data with DICOM.
The underlying components are compliant with the IHE framework.
So they have the capability to exchange, store, process
and visualize both text data and DICOM data.
And this is the diagram
of their structure. You can see
that their documentation system can be interfaced with standard hospital
or other information systems, with either HL7 or DICOM standards,
and is secured through gateways.
So the security concept that they adopted uses the HTTPS protocol, and there are tiers
of user authority with account name and password.
The patient data can actually be pseudonymized, and depending
on the authority level, the viewer can view either the real name
or the de-identified information.
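
The tiered access described here can be sketched in a few lines of Python. This is not the ULICE implementation, whose details are not given in the talk; it only assumes that a pseudonym is derived from the patient identifier with a keyed hash and that a user's role decides which view is returned. The key, the role names and the record fields are hypothetical.

```python
import hashlib
import hmac

# Hypothetical site-specific secret; a real system would store it securely.
SECRET_KEY = b"replace-with-a-site-specific-secret"


def pseudonym(patient_id: str) -> str:
    """Derive a stable pseudonym from the patient ID with a keyed hash."""
    digest = hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256)
    return "PSN-" + digest.hexdigest()[:12]


def view_record(record: dict, role: str) -> dict:
    """Return the full record or a pseudonymized copy, depending on the role."""
    if role == "treating_physician":
        return dict(record)
    # Every other role sees the record without direct identifiers.
    shown = {k: v for k, v in record.items() if k not in ("name", "patient_id")}
    shown["pseudonym"] = pseudonym(record["patient_id"])
    return shown


record = {"patient_id": "12345", "name": "Jane Doe", "diagnosis": "C32.9"}
physician_view = view_record(record, "treating_physician")
researcher_view = view_record(record, "researcher")
print(physician_view)
print(researcher_view)
```

Because the pseudonym is deterministic for a given key, records for the same patient can still be linked across studies without exposing the real name.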
So they have been using this system for a few months now
and have documented 900 patients.
They were able to exchange and store various DICOM RT data
to be viewed by a DICOM RT ion viewer.
And their future effort includes extensive automatic and electronic study analyses.
So this is a very nice system that can perhaps resolve the issue
for the European light ion centers. If we start to conduct clinical trials
between the US and Europe, and perhaps Japan,
what would be the optimal data integration method? Could it be centralized
or federated? We still need to work on these issues.
Now moving forward, in addition to the clinical data that was used
to check the modeling, what can we do to improve the area under
the curve performance of this model?
So obviously we need to have a larger database
that contains more patient information and more diversified parameters.
Now, there is simple geometrical information from CT
that one could incorporate into the modeling.
Also, biomarker information and radiomics and
genetic information can be combined with the clinical data
to hopefully improve the performance of the modeling
to the clinically acceptable level.
[ Pause ]
So now we come to the second component
of the bioinformatics effort, data analysis.
In the following examples, I'll present the data mining that we have undertaken
so that we can move towards evidence-based radiation therapy quality assurance.
So, the first example is on clinical target definition.
Now, why is it important to perform radiotherapy quality assurance?
There are two examples that I have included here.
One is from the TROG 02.02 head and neck trial,
where the outcome is actually not governed by the technique
that the clinical trial set out to compare, but by the quality
of the therapy that was given to the patient.
As you can see from the separation of the survival curves,
the compliant patients had a much better survival curve compared with patients
who received therapy that was not compliant
with the quality specification of the protocol.
One of the major violations of quality is actually target definition,
that is, missing targets, either from the target definition
or the radiotherapy planning, or wrong prescription,
and in some cases the duration of the treatment was too extended also.
Another similar example is from RTOG 9705, pancreatic cancer,
and we can see similar performance: there's a separation of the survival not
from the techniques the clinical trial set out to compare,
but from the difference in the quality of treatment given to the patient.
And again, the quality issues are with target definition
and missing part of the targets in the treatment.
So learning from the past experience,
we set out to perform this study before we activated RTOG 1106,
the adaptive protocol for treatment of lung cancer.
We collected three dry-run cases, sent them to about 12 institutions,
and asked experts to contour the targets as well as critical structures
for these three lung cancer patients.
And you can see the distribution of these contours from the 12 experts.
The mean sensitivity is actually 0.81,
with a large standard deviation of 0.16.
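
To make the sensitivity figure concrete, here is a minimal Python sketch. It assumes that each contour is represented as a set of voxel indices and that sensitivity is measured against a reference contour such as the consensus; the talk does not state the exact reference used. The voxel sets below are invented and are not the dry-run data.

```python
from statistics import mean, stdev


def sensitivity(contour, reference):
    """Fraction of the reference volume that the expert's contour covers."""
    if not reference:
        raise ValueError("reference contour is empty")
    return len(contour & reference) / len(reference)


# Hypothetical voxel index sets (illustration only).
reference = set(range(10))
expert_contours = [
    set(range(10)),     # covers the whole reference
    set(range(2, 12)),  # shifted: misses two reference voxels
    set(range(4, 10)),  # too small: misses four reference voxels
]

values = [sensitivity(c, reference) for c in expert_contours]
print(values)
print(mean(values), stdev(values))
```

A sensitivity well below one, or a wide spread between experts, signals that parts of the target are being left out of some contours.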
And this is the variation of the OARs, and you can see the difference
between the contours of this OAR.
The consensus contour, in the thick line, is plotted
against all the individual contours from around 12 experts,
and you can see a pretty substantial spread.
Now what are the impacts of this variation in the contours?
We have evaluated the tumor control probability using the consensus contour,
and we found that by doing so, the tumor control probability can be reduced
by up to 100 percent as compared with what the institution submitted
in terms of the dose matrix.
So, that is a substantial finding.
Now can we use that to maybe explore the unexpected result from RTOG 0617,
where we are trying to compare the outcome with the standard RT dose of 60 Gy with 74 Gy?
From the interim analysis that was presented at last year's ASTRO,
the high-dose arm had to be stopped because of the futility of continuing
with the trial: we have not demonstrated an advantage with 74 Gy,
and we will not be able to with the rest of the accrual.
So could our prior investigations point to one of the possible reasons
to explain this unexpected outcome?
That is one of the projects currently being undertaken
by the scientists at RTOG.
Now, we have also used the data that we have collected
for clinical trial quality assurance to establish evidence-based
quality assurance criteria for image guided radiotherapy.
So for image guided radiation therapy credentialing,
we asked institutions to submit DICOM data as well
as the shift information along with this DICOM data.
And there are a number of steps by which we have established the current quality
assurance criteria.
First, we set out to evaluate the different performance of multiple systems
and obtain the uncertainty that is associated
with different image registration systems, and that was incorporated
in the passing criteria that we use to review the IGRT credentialing.
Then we set out to credential IGRT for a number of disease sites, lung
and head and neck, reported the outcome of this IGRT credentialing,
and published the result.
From this investigation, we have found out what is the most impactful item
of the IGRT, and we have adapted our credentialing process accordingly.
[ Pause ]
And the following two examples are for evidence-based quality assurance
of radiotherapy planning, especially for intensity modulated radiotherapy.
One group is from Duke, Jackie Wu and Yaorong Ge;
they were invited to present their research
at the January Bioinformatics Working Group meeting.
They took head and neck IMRT, used the anatomical and physiological factors,
and quantified their individual influence
with mathematical modeling and machine learning.
They encode treatment planning
experience and guidelines using knowledge engineering,
and they established a model that uses these factors
to offer peer-review type guidance.
Now, this is an example of their result.
The red lines are their modeled upper and lower bounds of the DVH,
and the blue line is the actual DVH they obtained from the clinic.
And you can see that the performance is relatively good.
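
The comparison of a clinical DVH against modeled upper and lower curves can be sketched as follows. This is not the Duke group's model, which is not described in enough detail here; the sketch only assumes that the model supplies a lower and an upper volume fraction at each dose level, and it flags the dose levels where the clinical DVH leaves that band. All numbers are invented.

```python
def cumulative_dvh(voxel_doses, dose_levels):
    """Fraction of the structure's voxels receiving at least each dose level."""
    n = len(voxel_doses)
    return [sum(1 for d in voxel_doses if d >= level) / n
            for level in dose_levels]


def outside_band(dvh, lower, upper):
    """Indices of dose levels where the DVH falls outside the modeled band."""
    return [i for i, (v, lo, hi) in enumerate(zip(dvh, lower, upper))
            if not lo <= v <= hi]


# Hypothetical organ-at-risk voxel doses in Gy and a hypothetical model band.
voxel_doses = [10.0, 20.0, 30.0, 40.0]
dose_levels = [0.0, 15.0, 25.0, 35.0, 45.0]
lower = [0.9, 0.6, 0.3, 0.0, 0.0]
upper = [1.0, 0.9, 0.4, 0.3, 0.1]

dvh = cumulative_dvh(voxel_doses, dose_levels)
flagged = outside_band(dvh, lower, upper)
print(dvh)
print([dose_levels[i] for i in flagged])
```

A plan whose DVH rises above the upper curve at some dose level would be a natural candidate for the peer-review type of guidance mentioned above.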
Moving forward, we hope to strengthen the collaboration with them
so that they can test their model on the bigger RTOG database.
Also, they plan to incorporate all types of knowledge sources
into their predictive models
and to use the extended ontology framework to help with decision support.
One example is that they want to incorporate the QUANTEC guidelines
in their decision-making support.
Another similar project is by Kevin Moore, who was invited to
present at the June RTOG Bioinformatics meeting.
They take a similar approach: they have identified the need,
that IMRT plans are not always optimal, and ask how we can predict what kind
of quality to expect given a new patient's characteristics.
What they did is model the geometrical shape of the target
and critical structures and modeled this premise
so that they can actually predict the DVH outcome.
This graph shows that the red lines are the clinically approved DVHs,
the blue lines are from the average *** model,
and the black lines are from the refined *** model.
And you see the close resemblance of the model performance
to the clinically approved DVH.
So this is the prediction for a new patient.
And now, going forward again, we intend to establish a collaboration
with this group of researchers to test out the model with the RTOG database,
and perhaps we could use the results from this investigation to help us
with plan quality assurance for clinical trials in future endeavors.
[ Pause ]
So now, with the database and database integration,
we would be in great need of analytical methods
to extract information from this data.
So, a group of researchers associated
with RTOG has undertaken a project to study algorithms
that can resolve some of the challenges that we face in the data analysis.
One of the challenges that we face is that there is tremendous uncertainty
associated with all the data that we are analyzing.
So, with conventional frequentist inference methods,
for example maximum likelihood estimation, confidence intervals
and P-values, the uncertainty is not taken into account in a generic manner.
However, with Bayesian inference methods,
we have worked together with a mathematician to introduce the concepts
of belief and plausibility, and also
we used Dempster-Shafer theory, with which we can actually take
into account the uncertainty within the analysis process.
And this research has just been accepted by Physics in Medicine and Biology.
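
For readers unfamiliar with Dempster-Shafer theory, the two quantities named here have simple definitions: the belief in a hypothesis sums the evidence that fully supports it, and the plausibility sums the evidence that does not contradict it. The Python sketch below shows only those definitions, not the group's published method, and the mass assignment is invented for illustration.

```python
def belief(mass, hypothesis):
    """Total mass of focal sets that lie entirely inside the hypothesis."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)


def plausibility(mass, hypothesis):
    """Total mass of focal sets that overlap the hypothesis at all."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)


# Hypothetical evidence about one patient: some mass supports pneumonitis,
# some supports no pneumonitis, and the rest is left uncommitted.
PNEUMONITIS = frozenset({"pneumonitis"})
NONE = frozenset({"none"})
EITHER = PNEUMONITIS | NONE

mass = {PNEUMONITIS: 0.3, NONE: 0.5, EITHER: 0.2}

print(belief(mass, PNEUMONITIS), plausibility(mass, PNEUMONITIS))
```

The gap between belief and plausibility is the uncommitted mass, which is how the uncertainty in the data is carried through to the prediction rather than collapsed into a single number.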
So the belief and plausibility prediction is plotted
against the uncertainty range of the data points for radiation pneumonitis data
that we extracted from the QUANTEC publication.
We tested this against the conventional NTCP model, and it shows that
our result is very much in line with the conventional NTCP parameters such as TD50
and m. So we hope to apply this to more data and databases to offer a new way
of visualizing the data to make clinical decisions.
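
As background for the NTCP comparison, the following Python sketch evaluates the Lyman probit form of an NTCP curve. It assumes that the parameters mentioned are the Lyman model's TD50 (the dose giving a 50 percent complication probability) and its slope parameter m. The parameter values are illustrative only and are not taken from the talk or from QUANTEC.

```python
import math


def lyman_ntcp(dose, td50, m):
    """Lyman probit NTCP: standard normal CDF of (dose - TD50) / (m * TD50)."""
    t = (dose - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


TD50 = 30.0  # Gy, illustrative value only
M = 0.4      # dimensionless slope parameter, illustrative value only

for dose in (10.0, 20.0, 30.0, 40.0):
    print(dose, round(lyman_ntcp(dose, TD50, M), 3))
```

By construction the curve passes through 0.5 at TD50, and a smaller m gives a steeper dose response.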
Now, future directions. We just saw the new funding opportunity announcement
early in the week: we are going to regroup into new cooperative centers
as well as consolidating the quality assurance
for radiation therapy and imaging,
which we now call the IROC group.
How we consolidate and integrate all the data together, along
with tissue bank and statistics data, is a challenge that all of us are facing,
and it's going to be an exciting period for the next number of months and years.
Thank you for your attention.