Patrick Foley

Portland, Oregon, United States
486 followers 469 connections

Publications

  • Federated learning enables big data for rare cancer boundary detection

    Nature Communications

    Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging/infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to-date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multi-site collaborations, alleviating the need for data-sharing.

  • OpenFL: the open federated learning library

    Physics in Medicine & Biology

    Objective: Federated learning (FL) is a computational paradigm that enables organizations to collaborate on machine learning (ML) and deep learning (DL) projects without sharing sensitive data, such as patient records, financial data, or classified secrets. Approach: The Open Federated Learning (OpenFL) framework is an open-source Python-based tool for training ML/DL algorithms using the data-private collaborative learning paradigm of FL, irrespective of the use case. OpenFL works with training pipelines built with both TensorFlow and PyTorch, and can be easily extended to other ML and DL frameworks. Main Results: In this manuscript, we present OpenFL and summarize its motivation and development characteristics, with the intention of facilitating its application to existing ML/DL model training in a production environment. We further provide recommendations to secure a federation using trusted execution environments to ensure explicit model security and integrity, as well as maintain data confidentiality. Finally, we describe the first real-world healthcare federations that use the OpenFL library, and highlight how it can be applied to other non-healthcare use cases. Significance: The OpenFL library is designed for real-world scalability, trusted execution, and also prioritizes easy migration of centralized ML models into a federated training pipeline. Although OpenFL's initial use case was in healthcare, it is applicable beyond this domain and is now reaching wider adoption both in research and production settings. The tool is open sourced at github.com/intel/openfl.

  • The federated tumor segmentation (FeTS) tool: an open-source solution to further solid tumor research

    Physics in Medicine & Biology

    Objective: De-centralized data analysis becomes an increasingly preferred option in the healthcare domain, as it alleviates the need for sharing primary patient data across collaborating institutions. This highlights the need for consistent harmonized data curation, pre-processing, and identification of regions of interest based on uniform criteria. Approach: Towards this end, this manuscript describes the Federated Tumor Segmentation (FeTS) tool, in terms of software architecture and functionality. Main Results: The primary aim of the FeTS tool is to facilitate this harmonized processing and the generation of gold standard reference labels for tumor sub-compartments on brain magnetic resonance imaging, and further enable federated training of a tumor sub-compartment delineation model across numerous sites distributed across the globe, without the need to share patient data. Significance: Building upon existing open-source tools such as the Insight Toolkit (ITK) and Qt, the FeTS tool is designed to enable training deep learning models targeting tumor delineation in either centralized or federated settings. The target audience of the FeTS tool is primarily the computational researcher interested in developing federated learning models, and interested in joining a global federation towards this effort. The tool is open sourced at https://github.com/FETS-AI/Front-End.

  • OpenFL: An open-source framework for Federated Learning

    Federated learning (FL) is a computational paradigm that enables organizations to collaborate on machine learning (ML) projects without sharing sensitive data, such as patient records, financial data, or classified secrets. Open Federated Learning (OpenFL) is an open-source framework for training ML algorithms using the data-private collaborative learning paradigm of FL. OpenFL works with training pipelines built with both TensorFlow and PyTorch, and can be easily extended to other ML and deep learning frameworks. Here, we summarize the motivation and development characteristics of OpenFL, with the intention of facilitating its application to existing ML model training in a production environment. Finally, we describe the first use of the OpenFL framework to train consensus ML models in a consortium of international healthcare organizations, as well as how it facilitates the first computational competition on FL.

  • Accelerate Genomics Research with the Broad Intel Genomics Stack

    Intel

    Intel and the Broad Institute of MIT and Harvard are at the forefront of the effort to accelerate genomics analysis and the benefits it can produce. Together, Intel and Broad have introduced an integrated hardware and software solution to run Broad’s popular Genome Analysis Toolkit* (GATK*) faster, at unprecedented scale, and with easier deployment. It used to take six weeks to generate a database from 2,300 genomes. Now, using the Broad-Intel Genomics Stack* (BIGstack*), a database containing 5x more information can be generated in only two weeks.

    BIGstack is a game-changing, end-to-end integrated hardware and software package. With common, validated reference designs that use the latest-generation Intel® Xeon® Scalable processors, Intel® Arria® 10 Field Programmable Gate Array (FPGA) PCIe* cards, Intel® Omni-Path Architecture (Intel® OPA), and Intel® 3D NAND Solid State Drives (Intel® SSDs), BIGstack can help ease the complexity of running the genomics analysis pipeline (specifically, Broad Institute’s production-worthy Best Practices workflows) while dramatically speeding up the analysis process.

    This paper demonstrates how a BIGstack-based platform that uses the latest Intel Xeon Scalable processors and Intel 3D NAND SSDs achieves a throughput of up to 5 whole genomes and more than 100 whole exomes per day per node. Intel® FPGA technology further speeds up the individual sample analysis by up to 2.2x for whole genomes at a lower memory cost compared to prior-generation Intel® Xeon® processor for Broad’s GATK Best Practices. Information is provided about tools, technologies, optimizations, and methodology, as well as details about latency, throughput, and utilization of CPU, memory, and disk resources.

  • Primer for Image Informatics in Personalized Medicine

    Procedia Engineering

    Image informatics encompasses the concept of extracting and quantifying information contained in image data. Scenes, what an image contains, come from many imager devices such as consumer electronics, medical imaging systems, 3D laser scanners, microscopes, or satellites. There is a marked increase in image informatics applications as there have been simultaneous advances in imaging platforms, data availability due to social media, and big data analytics. An area ready to take advantage of these developments is personalized medicine, the concept where the goal is to tailor healthcare to the individual. Patient health data is computationally profiled against a large pool of feature-rich data from other patients to ideally optimize how a physician chooses care. One of the daunting challenges is how to effectively utilize medical image data in personalized medicine. Reliable data analytics products require as much automation as possible, which is a difficulty for data like histopathology and radiology images because we require highly trained expert physicians to interpret the information. This review targets biomedical scientists interested in getting started on tackling image analytics. We present high level discussions of sample preparation and image acquisition; data formats; storage and databases; image processing; computer vision and machine learning; and visualization and interactive programming. Examples will be covered using existing open-source software tools such as ImageJ, CellProfiler, and IPython Notebook. We discuss how difficult real-world challenges faced by image informatics and personalized medicine are being tackled with open-source biomedical data and software.

Courses

  • AI in Robotics
  • Computability and Algorithms
  • Computer Networks
  • Computer Vision
  • Data and Visual Analytics
  • High Performance Computer Architecture
  • Intro to Health Informatics
  • Machine Learning
  • Machine Learning for Trading
  • Reinforcement Learning
