Protein design using answer set programming

Abstract

Different proteins have different functions, which are determined by their structure. A protein is a sequence of amino acids folded into a three-dimensional structure. Each amino acid consists of an amine group, a carboxyl group and a side-chain. The side-chain is specific to each amino acid, and each amino acid can adopt numerous possible side-chain conformations, called rotamers. A protein has an associated energy, which depends on the amino acids and side-chain conformations that form it. Finding the set of amino acids and respective side-chain conformations that minimizes the total energy of a protein is therefore a very important problem, called protein design. In this work we develop a program to solve the protein design problem using Answer Set Programming (ASP), an approach to declarative problem solving. This work describes the ASP program implemented to solve protein design problems, using the ASP grounder gringo and the ASP solver clasp to search for the answer sets of the program. Two variants of the protein design problem were considered: in the first, the amino acids of the protein to design are kept fixed and only the side-chain conformations must be determined; in the second, the amino acids are not fixed, so both the amino acids and their respective side-chains must be determined. Two ASP implementations of the protein design problem were developed: one is a simple encoding and the other uses multi-criteria optimization. In addition, three dead-end elimination (DEE) algorithms were implemented: Original DEE, Simple Goldstein and Simple Split.
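As a rough illustration of the Original DEE criterion named in the abstract, the sketch below prunes rotamers on a tiny made-up instance. It is written in Python for readability and is not the ASP encoding described in this work. The energy tables and the names `self_energy`, `pair_energy`, `original_dee` and `brute_force_minimum` are all hypothetical, and the pairwise-decomposable energy model (self energies plus pair energies) is an assumption. A rotamer r at position i is discarded when its best-case energy contribution is still worse than the worst-case contribution of some alternative rotamer t at the same position.

```python
from itertools import product

# Hypothetical toy instance (values invented for illustration only).
# self_energy[i][r]: energy of rotamer r at position i.
self_energy = [
    [0.0, 2.0, 9.0],
    [1.0, 0.5],
    [0.0, 3.0],
]
# pair_energy[(i, j)][r][s]: interaction energy between rotamer r at
# position i and rotamer s at position j, stored only for i < j.
pair_energy = {
    (0, 1): [[0.0, 1.0], [-1.0, 0.5], [2.0, 3.0]],
    (0, 2): [[0.5, 0.0], [0.0, -0.5], [1.0, 4.0]],
    (1, 2): [[0.0, 1.0], [0.2, -0.3]],
}


def pair(i, r, j, s):
    """Symmetric lookup of the pair energy between (i, r) and (j, s)."""
    if i < j:
        return pair_energy[(i, j)][r][s]
    return pair_energy[(j, i)][s][r]


def total_energy(assignment):
    """Total energy of one rotamer choice per position."""
    n = len(assignment)
    energy = sum(self_energy[i][assignment[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += pair(i, assignment[i], j, assignment[j])
    return energy


def original_dee(candidates):
    """Repeatedly apply the Original DEE test until nothing changes.

    Rotamer r at position i is removed if some other rotamer t satisfies
    E(i_r) + sum_j min_s E(i_r, j_s) > E(i_t) + sum_j max_s E(i_t, j_s).
    """
    alive = [set(c) for c in candidates]
    changed = True
    while changed:
        changed = False
        for i in range(len(alive)):
            others = [j for j in range(len(alive)) if j != i]
            for r in sorted(alive[i]):
                best_r = self_energy[i][r] + sum(
                    min(pair(i, r, j, s) for s in alive[j]) for j in others
                )
                for t in sorted(alive[i] - {r}):
                    worst_t = self_energy[i][t] + sum(
                        max(pair(i, t, j, s) for s in alive[j]) for j in others
                    )
                    if best_r > worst_t:
                        alive[i].discard(r)  # r cannot be in a minimum
                        changed = True
                        break
    return alive


def brute_force_minimum():
    """Exhaustive search, feasible only for tiny instances like this one."""
    ranges = [range(len(e)) for e in self_energy]
    return min(product(*ranges), key=total_energy)


all_rotamers = [set(range(len(e))) for e in self_energy]
pruned = original_dee(all_rotamers)
best = brute_force_minimum()
print("surviving rotamers per position:", pruned)
print("minimum-energy assignment:", best, "energy:", total_energy(best))
```

Because the test only discards rotamers that cannot appear in a minimum-energy assignment, the exhaustive optimum should always remain inside the pruned search space. On real instances the pruning is used as a preprocessing step before search, since exhaustive enumeration is not feasible.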

Filipe Gouveia
Computer Science Researcher

My research interests include artificial intelligence, computational logic and automated reasoning.