Bioinformatics & High Performance Computing Service Center

Center Overview

The UTHealth Houston Bioinformatics and High-Performance Computing Service Center is part of the Center for Clinical and Translational Sciences. It offers extensive expertise to meet the bioinformatics and high-performance computing needs of UTHealth Houston, the Texas Medical Center, and beyond. Drawing on advanced data science knowledge and robust computing infrastructure, the center also helps investigators integrate cutting-edge artificial intelligence methods into their projects. Key areas of support are listed below, along with related publications and grant awards:

  • High performance computing for bioinformatics, data science and AI

    Our computing infrastructure is designed to support advanced research needs. It features GPU servers for training AI models, large-memory servers for high-throughput data analysis at scale, a Hadoop cluster with 36 computing nodes for hosting large databases and performing rapid data analysis, and over 3 petabytes of storage for managing large datasets. The advanced computing hardware is listed below:

    1. The world's first Nvidia DGX-H100 GPU server, the fastest and most complete platform for enterprise AI. The DGX-H100 system has 8 Nvidia H100 Tensor Core GPUs with 640 GB of GPU memory, 32 petaFLOPS of FP8 performance, 4 Nvidia NVSwitches, and 2 TB of system memory.
    2. One Dell PE R750 with 5 TB of system memory, two Intel Xeon Platinum 8380 processors (80 cores/160 threads), 180 TB of system storage, and 2 Nvidia H100 GPU cards.
    3. One Nvidia DGX-A100 GPU server with 5 petaFLOPS AI/10 petaOPS INT8 performance, 8 Nvidia A100 GPUs, 12 NVLinks per GPU (600 GB/s GPU-to-GPU communication), 320 GB of total GPU memory, 65,536 Nvidia CUDA cores, 4,096 Nvidia Tensor Cores, 1 TB of DDR4 memory, and 30 TB of SSD storage.
    4. One EXXACT TS4 deep learning certified system with 24 CPU cores, 10 Nvidia RTX 2080 Ti GPUs (43,520 cores), 376 GB of memory, and 12 TB of internal storage.
    5. One Supermicro 8-way system with 448 CPU cores, 1 Xilinx Alveo U250 FPGA, 6 TB of system memory, and a 60 TB local SSD drive. It also has an Nvidia V100 GPU (20,480 cores) and an additional 200 TB of external storage.
    6. 3 PB of storage space provided by various direct-attached storage devices, NFS server and Dell ME5084 Storage Array. 
    7. One 36 computing-node Dell EMC Hadoop Cluster with 864 total computing cores, 12.8 TB combined memory, and 1.5 PB (raw) storage.

    Advanced Hardware Infrastructure for AI and Big Data
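The per-GPU and per-node figures implied by the hardware list above can be worked out with simple division. The following is a minimal sketch, assuming each pooled resource is split evenly across its units, which the list does not state explicitly. The names `SPECS`, `per_unit`, and `PER_UNIT` are hypothetical and chosen for illustration only.

```python
# (total, number of units) pairs transcribed from the hardware list above.
# Assumption: each total is divided evenly across the units it is paired with.
SPECS = {
    "DGX-H100 GPU memory, GB per GPU": (640, 8),
    "DGX-A100 GPU memory, GB per GPU": (320, 8),
    "EXXACT TS4 GPU cores per GPU": (43_520, 10),
    "Hadoop cluster CPU cores per node": (864, 36),
}


def per_unit(total: int, units: int) -> float:
    """Average share of a pooled resource per unit, assuming an even split."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total / units


# Derived per-unit figures, keyed by the same labels as SPECS.
PER_UNIT = {label: per_unit(total, units) for label, (total, units) in SPECS.items()}

if __name__ == "__main__":
    for label, value in PER_UNIT.items():
        print(f"{label}: {value:g}")
```

These derived values are only rough planning aids, for example when judging whether a model fits in the memory of a single GPU.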

    In addition, the Texas Advanced Computing Center (TACC, https://www.tacc.utexas.edu) is available through a high-speed network (Internet2). TACC operates many high-performance computing systems, including Frontera, which ranked as the fifth most powerful supercomputer in the world in 2019. TACC's environment includes high-performance computing, visualization, data analysis, storage systems, software, and portal interfaces that enable researchers to answer questions more efficiently and effectively using advanced computing resources. TACC provides systems and software to researchers and has worked on over 3000 projects by more than 1000 researchers at over 350 institutions nationally and worldwide, addressing scientific questions to improve quality of life.

    All of these systems are connected through a multi-platform computer network (1 Gbps, 10 Gbps, and 25 Gbps) that is continuously being upgraded to state-of-the-art technology.

    Our service center has extensive experience in developing high-performance computing applications to facilitate large-scale data analysis, as demonstrated in the publication below:

    1. Shikun Wang*, Zhao Li*, Lan Lan, Jieyi Zhao, Jim Zheng#, Liang Li#: GPU accelerated estimation of a shared random effect joint model for dynamic prediction, Computational Statistics & Data Analysis, Volume 174, 2022, 107528, ISSN 0167-9473, https://doi.org/10.1016/j.csda.2022.107528. (*Equal contribution, #Corresponding authors)
  • Advanced protein structure prediction by AlphaFold, RFDiffusion and more

    A recent breakthrough in artificial intelligence is the development of AlphaFold, which enables computational prediction of protein structures from primary sequences. Leveraging our advanced computing infrastructure and expertise in structure-function analysis, we have predicted over 10,000 protein structures, including detailed analyses of how alternative splicing impacts protein structure. With the latest AlphaFold 3 and RFDiffusion, we can perform high-throughput structure predictions for novel protein sequences. Additionally, we collaborate with investigators to conduct further functional and structural analyses based on the predicted models.

    1. Yuntao Yang, Himansu Kumar, Yuhan Xie, Zhao Li, Rongbin Li, Wenbo Chen, Chiamaka S Diala, Meer A Ali, Yi Xu, Albon Wu, Sayed-Rzgar Hosseini, Erfei Bi, Hongyu Zhao, Pora Kim#, W Jim Zheng#: ASpdb: an integrative knowledgebase of human protein isoforms from experimental and AI-predicted structures, Nucleic Acids Research, 53(D1):D331-339, 2025, https://doi.org/10.1093/nar/gkae1018. PMID: 39530217. (#Corresponding authors)
  • Analyze and annotate biological data with AI

    Our team has extensive experience in genomic data analysis and has developed award-winning AI methods for literature mining and for annotating data on genes, drugs, and diseases. These methods are particularly valuable for analyzing novel genes and their functions, especially when limited information or annotation is available and traditional approaches, such as Gene Set Enrichment Analysis, fall short.

    1. Zhao Li, Qiang Wei, Liang-Chin Huang, Jianfu Li, Yan Hu, Yao-Shun Chuang, Jianping He, Avisha Das, Vipina Kuttichi Keloth, Yuntao Yang, Chiamaka S Diala, Kirk E Roberts, Cui Tao, Xiaoqian Jiang, Jim Zheng#, Hua Xu#: Ensemble pretrained language models to extract biomedical knowledge from literature, Journal of the American Medical Informatics Association, 31(9):1904-1911, 2024. doi: 10.1093/jamia/ocae061. PMID: 38520725. (#Corresponding authors)
    2. Avisha Das*, Zhao Li*, Qiang Wei*, Jianfu Li*, Liang-chin Huang*, Yan Hu, Rongbin Li, Jim Zheng#, Hua Xu#: Extracting Drug-Protein Relation from Literature using Ensembles of Biomedical Transformers, Studies in Health Technology and Informatics, 310:639-643, 2024. (The method won 2nd place in the Large Scale Track of the 2021 BioCreative VII DrugProt competition, and the paper was accepted by MedInfo 2023.) (*Equal contribution, #Corresponding authors)
  • High throughput, large scale data analysis and mining

    Our team has extensive experience in high-throughput, large-scale data analysis and mining across a diverse range of biological data types, as demonstrated in some of our recent publications.

    1. Teresa T. Nguyen, Dong Ho Shin, Sagar Sohoni, Yisel Rivera-Molina, Hong Jiang, Xuejun Fan, Sanjay K. Singh, Joy Gumin, Frederick F. Lang, Marta M. Alonso, Lisha Zhu, Jim Zheng, Lijie Zhai, Erik Ladomersky, Kristen L. Lauing, Derek A. Wainwright, Candelaria Gomez-Manzano, Juan Fueyo: Reshaping the tumor microenvironment with oncolytic viruses, positive regulation of the immune synapse, and blockade of the immunosuppressive oncometabolic circuitry, Journal for ImmunoTherapy of Cancer, 10(7):e004935 2022, PMID: 35902132, PMCID:PMC9341188.
    2. Michihiro Kobayashi, Haichao Wei, Takashi Yamanashi, Nathalia Azevedo Portilho, Samuel Cornelius, Noemi Valiente, Chika Nishida, Haizi Cheng, Augusto Latorre, Jim Zheng, Joonsoo Kang, Jun Seita, David J Shih, Jia Qian Wu, Momoko Yoshimoto: HSC-independent definitive hematopoiesis persists into adult life, Cell Reports, 2023 Mar 28;42(3):112239. doi: 10.1016/j.celrep.2023.112239. Epub 2023 Mar 11. PMID: 36906851; PMCID: PMC10122268.
    3. Chandra Sekhar Amara, Karthik Reddy Kami Reddy, Yang Yuntao, Yuen San Chan, Danthasinghe Waduge Badrajee Piyarathna, Lacey Elizabeth Dobrolecki, David J H Shih, Zhongcheng Shi, Jun Xu, Shixia Huang, Matthew J Ellis, Andrea B Apolo, Leomar Y Ballester, Jianjun Gao, Donna E Hansel, Yair Lotan, H. Courtney Hodges, Seth P Lerner, Chad J Creighton, Arun Sreekumar, Jim Zheng, Pavlos Msaouel, Shyam M Kavuri, Nagireddy Putluri: The IL6/JAK/STAT3 signaling axis is a therapeutic vulnerability in SMARCB1-deficient bladder cancer, Nature Communications, 2024 Feb 14;15(1):1373. doi: 10.1038/s41467-024-45132-2. PMID: 38355560; PMCID: PMC10867091.
    4. Xiaohua Ye, David J. H. Shih, Zhiqiang Ku, Junping Hong, Diane F. Barrett, Richard E. Rupp, Ningyan Zhang, Tong-Ming Fu, Jim Zheng#, Zhiqiang An#: Transcriptional signature of durable effector T cells elicited by a replication defective HCMV vaccine, npj Vaccines, 2024 Apr 1;9(1):70. doi: 10.1038/s41541-024-00860-w. PMID: 38561339; PMCID: PMC10984989. (#Corresponding authors)
  • Grant development

    We actively collaborate with UTHealth faculty on grant development, with significant success. Our support covers the full process, including initial brainstorming of research ideas, proposal writing, and preliminary data generation. Below are examples of recent grant awards we have supported:

    Year  Funding Source                                 Grant ID                  Total Award  Project Start  Project End
    2020  NIH                                            KL2 TR003168              $330,800     7/1/20         6/30/22
    2020  NIH                                            1K01AI148593-01A1         $481,000     7/1/20         6/30/24
    2021  ANRF                                           Arthritis Nat’l Res. Fdn  $100,000     9/1/20         8/31/21
    2021  CPRIT                                          RP210045                  $3,998,553   6/1/21         5/31/26
    2022  CPRIT                                          RP220244                  $1,397,258   4/1/22         3/31/26
    2022  Roderick D. MacDonald Baylor St. Luke’s Award  66727-1                   $50,000      5/1/22         4/30/23
    2022  NIH                                            K08AR081402-01            $820,800     8/10/22        7/31/27
    2023  NIH                                            5R21DE031440              $454,028     9/19/22        9/18/24
    2023  NIH                                            1R01AR081280-01A1         $1,124,325   5/1/23         4/30/28
    2024  NIH                                            1K23AR083506              $812,700     2/12/24        1/31/29
    2024  NIH                                            1UM1TR004906-01           $55,500,000  7/24/24        6/30/29

In addition to the areas highlighted above, we offer a wide range of traditional bioinformatics data analysis services, including (but not limited to) sequence analysis, gene expression analysis (microarray, bulk sequencing, single-cell RNA-Seq, spatial transcriptomics, etc.), genotyping, proteomics, metabolomics, and data mining. We also provide customized and complex analyses tailored to your specific needs. Our service rates are shown below:

Bioinformatics Service Hourly Rate:

Service                                 UTHSC/MDACC  Other (Non-Profit)  Other (For Profit)
Short-Term Projects
  Gene Annotation                       $125         $200                $250
  Protein Structure Prediction          $125         $200                Contact us
  Microarray Analysis                   $125         $200                $250
  Metabonomic Data Analysis             $125         $200                $250
  Proteomics Analysis                   $125         $200                $250
  Genotyping Analysis                   $125         $200                $250
  Next-Gen Sequencing Analysis          $125         $200                $250
  Custom Data Analysis                  $125         $200                $250
  HPC Service (not the computing time)  $125         $200                $250
Long-Term Projects
  Complex or Long-Term Projects         Contact us   Contact us          Contact us
Consultations
  Initial Consultation                  No charge    No charge           No charge
  Collaborative Grant Proposals         Contact us   Contact us          Contact us
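
To get a rough budget figure from the hourly rates above, the arithmetic can be sketched as follows. This is an illustrative estimate only, assuming a project is billed as hours multiplied by the listed hourly rate, with no minimums, discounts, or computing-time charges. The center's actual billing practice is not described here, and the names `RATES` and `estimate_cost` are hypothetical.

```python
from typing import Optional

# Hourly rates in USD for short-term projects, transcribed from the rate table above.
STANDARD = {"UTHSC/MDACC": 125, "Other (Non-Profit)": 200, "Other (For Profit)": 250}

SERVICES = [
    "Gene Annotation",
    "Protein Structure Prediction",
    "Microarray Analysis",
    "Metabonomic Data Analysis",
    "Proteomics Analysis",
    "Genotyping Analysis",
    "Next-Gen Sequencing Analysis",
    "Custom Data Analysis",
    "HPC Service (not the computing time)",
]

RATES = {service: dict(STANDARD) for service in SERVICES}
# None marks a "Contact us" entry, for which no rate is published.
RATES["Protein Structure Prediction"]["Other (For Profit)"] = None


def estimate_cost(service: str, customer: str, hours: float) -> Optional[float]:
    """Return hours * hourly rate, or None when the table says "Contact us".

    Assumption: simple hourly billing; this is not an official quote.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    rate = RATES[service][customer]
    if rate is None:
        return None
    return rate * hours


if __name__ == "__main__":
    print(estimate_cost("Gene Annotation", "UTHSC/MDACC", 10))
    print(estimate_cost("Protein Structure Prediction", "Other (For Profit)", 5))
```

A `None` result means the table lists "Contact us" for that combination, so the center would need to provide a quote. Long-term projects and collaborative grant proposals are left out for the same reason.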