Precision in Proteins: Predicting pKa with PypKa, a Poisson–Boltzmann-based Tool

PypKa, a Poisson–Boltzmann-based pKa predictor for proteins using 3D structures as input, is a tool developed by the Machuqueiro Lab at the University of Lisbon, Portugal.

PypKa

About

PypKa, developed by the Machuqueiro Lab at the University of Lisbon, Portugal, is a Poisson–Boltzmann-based pKa predictor for proteins that takes 3D structures as input. It also predicts isoelectric points and can process PDB structures to assign the correct protonation states to all residues. The PypKa web server streamlines pKa calculations and the preparation of biomolecular structures, complementing the available command-line interface (CLI) and API. The web server also gives access to pKPDB, a large database of more than 10 million pKa values and 120k isoelectric points, which can be browsed and downloaded.
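To make the link between pKa values, protonation states and the isoelectric point concrete, here is a minimal sketch. It is not PypKa's algorithm: PypKa derives its pKa values from Poisson–Boltzmann electrostatics, whereas this example assumes a list of pKa values is already known, treats every titratable site as independent, and applies the Henderson–Hasselbalch relation. The function names and the two-site example are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# A titratable site is described by (pKa, charge of its protonated form).
# Assumption for this sketch: +1 for basic sites (Lys, Arg, His, N-terminus)
# and 0 for acidic sites (Asp, Glu, Tyr, Cys, C-terminus).
Site = Tuple[float, int]


def protonated_fraction(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of a site that is protonated at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def net_charge(sites: List[Site], ph: float) -> float:
    """Average net charge at a given pH, treating all sites as independent."""
    total = 0.0
    for pka, protonated_charge in sites:
        frac = protonated_fraction(pka, ph)
        # Losing the proton lowers the charge of the site by one unit.
        total += frac * protonated_charge + (1.0 - frac) * (protonated_charge - 1)
    return total


def isoelectric_point(sites: List[Site], lo: float = 0.0, hi: float = 14.0,
                      tol: float = 1e-6) -> float:
    """Find the pH of zero net charge by bisection.

    The net charge decreases monotonically with pH, so a sign change
    between lo and hi brackets exactly one root.
    """
    if net_charge(sites, lo) <= 0.0 or net_charge(sites, hi) >= 0.0:
        raise ValueError("net charge does not change sign in the given pH range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_charge(sites, mid) > 0.0:
            lo = mid  # still positive: the isoelectric point lies at higher pH
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative two-site molecule: one acidic and one basic group
# (textbook pKa values for free glycine, not PypKa output).
example_sites: List[Site] = [(2.34, 0), (9.60, 1)]
example_pi = isoelectric_point(example_sites)
```

For a molecule with one acidic and one basic group, the isoelectric point is the midpoint of the two pKa values, so `example_pi` comes out at about 5.97. In a real protein the sites interact with one another and with their environment, which is why a structure-based method is needed to obtain the pKa values in the first place.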

BioISI

About

PypKa is free to use and does not require registration. It serves a large community working on protein dynamics and electrostatics, and its isoelectric point calculations also appeal to many experimentalists. The pKPDB database is freely available and has already been used to develop AI-based methods that speed up PypKa's predictions.

Characteristics

The aim is to establish an easy-to-use cloud service for fast pKa and isoelectric point calculations on protein structures, either provided by the user or obtained from the Protein Data Bank. The primary goal is to make PypKa the go-to solution for these calculations, building upon its high accuracy and computational speed. Additionally, we aim to create a large dataset of pKa values and isoelectric points, which will be pivotal in training machine-learning algorithms.

PypKa is a Python module that predicts Poisson–Boltzmann-based pKa values of biomolecules. It is a free and open-source project that provides a simple, reusable and extensible Python API and CLI for pKa calculations, with a good trade-off between speed and accuracy. With PypKa, pKa calculations, including optional proton tautomerism, can be added to existing protocols with a few extra lines of code. PypKa supports CPU parallel computing on anisotropic (membrane) and isotropic (protein) systems and lets the user choose the balance between accuracy and speed.

Computing needs

Because the performance of the PypKa cloud service scales almost linearly with the number of CPU cores, the PypKa cloud server was deployed on the cloud resources of the EGI cloud infrastructure. Specifically, a dedicated virtual cluster, whose resources scale dynamically with the number of user requests to be served, was configured using the Infrastructure Manager (IM) solution developed by the Grid and High-Performance Computing Group (GRyCAP) at the Instituto de Instrumentación para Imagen Molecular (I3M) of the Universitat Politècnica de València (UPV).

The Solution

  • EGI Cloud Compute and the cloud-based EGI Online Storage, to provide the resources for the web server application.
  • EGI Check-in, to enable user registration and authentication.
  • Technical consultancy, to make the most of the EGI solutions.

Additionally, the Infrastructure Manager (IM) was used to deploy the virtual and elastic cluster for the PypKa cloud service on top of the EGI Federated Cloud infrastructure.

Services Provided by EGI

  • Online Storage: Store, share and access your files and their metadata on a global scale.
  • Check-in: Login with your own credentials.
  • Cloud Compute: Run virtual machines on-demand with complete control over computing resources.
  • IM Dashboard: A graphical interface for the IM Server, specially developed for EOSC users to access EGI Cloud Compute resources.
  • Training Infrastructure: Dedicated computing and storage for training and education.

Useful Resources