2020
Authors
Fernandes, D; Silva, C; Dutra, I;
Publication
ACM Crossroads
Abstract
2020
Authors
Carrera, I; Dutra, I; Tejera, E;
Publication
2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE
Abstract
One important problem in Bioinformatics is the discovery of new interactions between cellular lines and chemical compounds. In silico methods for cell-line screening are fundamental to optimize cost and time in the drug discovery process. In order to build these methods, we need to computationally represent cell lines. Current methods for modeling cell line interactions rely on comparing genetic expression profiles. However, these profiles are usually unknown. In this work, we present a method to characterize and represent cell lines by text-processing the related scientific literature. We collect abstracts of scientific papers about cellular lines from Cellosaurus and PubMed. These documents are then represented as TF-IDF vectors. We build a data set for classification with the document vectors having the cell line identifier as the target class. We then apply a multiclass SVM classification method. We use Support Vector Domain Description to describe and characterize each cell line with its corresponding hyperplane obtained with a one-vs-rest training. We evaluated several configurations of classifiers, using micro-averaged precision as the metric to choose the best classifier, and were able to differentiate cellular lines from a set of 200+.
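As a rough illustration of the TF-IDF representation mentioned in this abstract, the sketch below computes TF-IDF vectors for a few toy documents using only the Python standard library. The toy abstracts, the whitespace tokenizer, and the unsmoothed weighting (term frequency times log(N / document frequency)) are assumptions made for illustration; the paper's actual preprocessing and SVM training are not shown here.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return (vocabulary, vectors); each vector holds one TF-IDF weight per term."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for an empty document
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors

# Hypothetical toy abstracts (not taken from Cellosaurus or PubMed).
abstracts = [
    "hela cells show rapid proliferation",
    "mcf7 cells respond to estrogen",
    "hela cells derived from cervical tissue",
]

vocabulary, vectors = tfidf_vectors(abstracts)
```

A term that occurs in every document (here "cells") gets weight zero, which is why TF-IDF tends to emphasize the terms that distinguish one cell line's literature from another's.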
2021
Authors
Carrera, I; Tejera, E; Dutra, I;
Publication
Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2021, Volume 5: HEALTHINF, Online Streaming, February 11-13, 2021.
Abstract
The discovery of new biological interactions, such as interactions between drugs and cell lines, can improve the way drugs are developed. Recently, there has been considerable interest in predicting interactions between drugs and targets using recommender systems and, more specifically, in using recommender systems to predict drug activity on cellular lines. In this work, we present a simple and straightforward approach for the discovery of interactions between drugs and cellular lines using collaborative filtering. We represent cellular lines by their drug affinity profile, and correspondingly, represent drugs by their cell line affinity profile in a single interaction matrix. Using simple matrix factorization, we predicted previously unknown values, minimizing the regularized squared error. We build a comprehensive dataset with information from the ChEMBL database. Our dataset comprises 300,000+ molecules, 1,200+ cellular lines, and 3,000,000+ reported activities. We have been able to successfully predict drug activity, and evaluate the performance of our model via utility, achieving an Area Under ROC Curve (AUROC) of near 0.9.
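The matrix factorization idea in this abstract can be sketched as follows: observed entries of a drug by cell line matrix are approximated by the dot product of two low-rank factor vectors, fitted by stochastic gradient descent on the regularized squared error. This is a minimal sketch under stated assumptions: the function names, the hyperparameters (rank, learning rate, regularization, epochs), and the tiny toy matrix are hypothetical and are not taken from the paper or from ChEMBL.

```python
import random

def factorize(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
              epochs=2000, seed=0):
    """Fit factors P (rows) and Q (columns) on the observed entries only.

    Minimizes sum (r_ij - p_i . q_j)^2 + reg * (|p_i|^2 + |q_j|^2) by SGD.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for (i, j), r in observed.items():
            err = r - sum(P[i][k] * Q[j][k] for k in range(rank))
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                # Gradient step on the squared error plus the L2 penalty.
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q

def predict(P, Q, i, j):
    """Predicted activity of drug i on cell line j."""
    return sum(p * q for p, q in zip(P[i], Q[j]))

# Hypothetical 3 drugs x 3 cell lines; entry (2, 2) is treated as unknown.
observed = {
    (0, 0): 0.90, (0, 1): 0.40, (0, 2): 0.60,
    (1, 0): 0.45, (1, 1): 0.20, (1, 2): 0.30,
    (2, 0): 0.72, (2, 1): 0.32,
}

P, Q = factorize(observed, n_rows=3, n_cols=3)
unknown_estimate = predict(P, Q, 2, 2)
```

The regularization term keeps the factors small, which matters far more on a sparse matrix with millions of entries than on this toy example.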
2019
Authors
Fernandes, D; Dutra, I;
Publication
ACM Crossroads
Abstract
2019
Authors
Pinheira, A; Silva Dias, Rd; Nascimento, C; Dutra, I;
Publication
Computational Intelligence Methods for Bioinformatics and Biostatistics - 16th International Meeting, CIBB 2019, Bergamo, Italy, September 4-6, 2019, Revised Selected Papers
Abstract
Bipolar Disorder (BD) is a chronic and severe psychiatric illness presenting with mood alterations, including manic, hypomanic and depressive episodes. Due to the high clinical heterogeneity and lack of biological validation, both BD treatment and diagnosis are still problematic. Patients and clinicians would benefit from better clinical and biological characterization, ultimately opening new possibilities for distinct forms of treatment. In this context, we studied genome wide association (GWA) data from the Wellcome Trust Case Control Consortium (WTCCC). After an exploratory analysis, we found a higher prevalence of homozygous compared with heterozygous genotypes in different single nucleotide polymorphisms (SNPs) in genes previously associated with BD risk. Results from our association rules analysis indicate that there is a group of patients presenting with different groups of genotypes, including pairs or triples, while others present only one. We performed the same analysis with a control group from the same cohort (WTCCC) and found that although healthy subjects may present the same SNP combinations, the risk alleles occur at a lower frequency. Moreover, no subject in the control group presented the same pairs or triples of genotypes found in the BD group, and if a pair or triple is found, the support and confidence are lower than in the BD group (< 50 %). © Springer Nature Switzerland AG 2020.
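For readers unfamiliar with the support and confidence measures mentioned in this abstract, the sketch below computes both for a genotype pair over a small set of subjects. The SNP labels, genotypes, and subjects are invented for illustration and do not come from the WTCCC data; the helper names are likewise hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of antecedent."""
    base = support(transactions, antecedent)
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base if base else 0.0

# Hypothetical subjects, each described by a set of "SNP=genotype" labels.
subjects = [
    {"snp1=AA", "snp2=GG"},
    {"snp1=AA", "snp2=GG"},
    {"snp1=AA", "snp2=AG"},
    {"snp1=AG", "snp2=GG"},
]

pair_support = support(subjects, {"snp1=AA", "snp2=GG"})
rule_confidence = confidence(subjects, {"snp1=AA"}, {"snp2=GG"})
```

Computing the same two numbers separately for a case group and a control group is one simple way to compare how often a genotype pair occurs in each.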
2019
Authors
Fernandes, D; Silva, C; Dutra, I;
Publication
ACM Crossroads
Abstract