

An In-Depth Analysis of Neuropilin 1
Nicole Dano and Jessica Hrnciar
Date Published: December 14, 2020
Tools
Databases Used:
-
BLAST (Basic Local Alignment Search Tool) compares amino acid or nucleic acid sequences against a database of known sequences
-
NCBI gives access to databases that cover a wide variety of general information about a gene
-
RefSeq is a reference sequence database that provides genomic, transcript, and protein sequences and related information
-
AceView is an extension of the NCBI database that is helpful for gene mapping and for gathering information about protein isoforms and gene-associated diseases (8)
-
EPD (Eukaryotic Promoter Database) provides access to multiple databases that are helpful when determining promoter regions (9)
-
Ensembl is a database used to view a gene at the genomic level, and it provides information on transcript variants, genomics, and expression (10)
-
Chimera is a 3D interactive protein structure viewer with tools allowing certain sequences to be highlighted (11)
-
RCSB PDB contains 3D structures of proteins along with their sequences and general protein information (12)
-
UniProt provides free access to information about protein sequences and functions (14)
-
Prosite is a database of protein domains and other protein-identifying features (17)
-
HPRD (Human Protein Reference Database) is a database that provides basic protein information such as domain information and expression (36)
-
GeneCards is a database that provides thorough information on genes (38)
-
Bgee is a database that contains information on gene expression in many different organisms at different stages of life (37)
-
OMIM (Online Mendelian Inheritance in Man) provides information on the mutations and diseases related to genes (40)
-
ClinVar provides reports on the relationships between human genetic variation and health (39)
-
dbSNP is a database of single nucleotide polymorphisms and other variants and mutations that affect a protein (41)
-
PubMed Central is a free full-text archive of life science articles and studies
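To make the idea of comparing a query sequence to a database of known sequences more concrete, the toy Python sketch below ranks database entries by how many short subsequences (k-mers) they share with the query. It is not BLAST's actual algorithm, which uses a seed-and-extend heuristic with substitution scoring and alignment statistics. The sequences and the names `kmers`, `rank_hits`, `toy_database`, and `toy_query` are placeholders invented for illustration, not real Neuropilin 1 data.

```python
def kmers(seq, k=3):
    """Return the set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_hits(query, database, k=3):
    """Rank database entries by the number of k-mers shared with the query.

    This only illustrates the general idea of a sequence search;
    it does not perform alignment or compute any statistics.
    """
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(seq, k))
        for name, seq in database.items()
    }
    # Highest number of shared k-mers first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical placeholder sequences (not real proteins).
toy_database = {
    "toy_protein_A": "MKTAYIAKQRQISFVK",
    "toy_protein_B": "GSHMLEDPVDAFKLLG",
}
toy_query = "AYIAKQRQ"

best_name, best_score = rank_hits(toy_query, toy_database)[0]
print(best_name, best_score)
```

In practice the search is run through the NCBI BLAST web page, as described in the Methods, rather than with hand-written code.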
Methods
-
The gene was identified by entering the provided amino acid sequence into BLAST.
-
On the same NCBI page, general gene information was provided under the summary category, which allowed us to identify our gene properly.
-
Other information, such as the gene table and pre-mRNA isoforms, was found using AceView and EPD. General information about our gene, such as its name and symbols, was obtained from NCBI. Isoform and expression information was found using AceView.
-
The protein associated with our gene was found through the sidebar links that NCBI provided, and NCBI also listed the common names of our gene.
-
The tables and diagrams we created combine information from several different databases. AceView provided the most information and was the primary source used to create the tables and gene diagrams; it also provided information on isoforms and expression. EPD was used to identify the location of promoter sequence features. Ensembl was another useful tool, which allowed us to identify differential gene expression during development.
-
The protein structure was identified next, which helped us infer and learn about the protein's function, since structure and function are intimately related. Through a link on our NCBI protein page, we were able to identify several structures and their PDB IDs. After choosing the PDB structure we wanted to focus on, we visualized it in Chimera. Several research articles helped us better understand the structure of our protein and how it often forms complexes with other proteins. Chimera also allowed us to highlight specific domains on our protein. UniProt and the NCBI conserved domains link allowed us to identify the secondary structure and the binding sites of our protein. Prosite, AceView, and UniProt were the databases we used to identify the specific features of the domains found in our protein.
-
Next, we studied the function of our protein further. UniProt and AceView listed many functions of our protein, but we mainly relied on research articles to understand the specifics. We used UniProt and AceView again to find the biological pathways our protein is involved in, but we relied heavily on research articles to understand our protein's role in each specific pathway and the overall function of that pathway. The subcellular localization of our protein was found using UniProt and GeneCards, and AceView was used for isoform-specific localization. To understand the expression of our protein, we used AceView, HPRD, GeneCards, and Ensembl. GeneCards was used to find common protein interactions, and it also provided specific information on those interactions.
-
Mutations and diseases associated with our protein were studied next. Common diseases were listed on AceView, and details on our protein's role in these diseases were studied mostly through research articles. ClinVar, OMIM, DisGeNET, dbSNP, and NCBI were used to research mutations of our protein and their implications.
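As a compact summary of the steps above, the Python sketch below records which resources were named for each part of the analysis and builds a reverse lookup from resource to steps. The step labels and the names `WORKFLOW` and `steps_by_resource` are our own shorthand chosen for illustration; the groupings are an approximate reading of the Methods text, and research articles, which were used throughout, are left out.

```python
from collections import defaultdict

# Analysis steps (shorthand labels, an assumption) mapped to the
# resources the Methods text names for each of them.
WORKFLOW = {
    "gene identification": ["BLAST", "NCBI"],
    "gene table and pre-mRNA isoforms": ["AceView", "EPD"],
    "promoter features": ["EPD"],
    "developmental expression": ["Ensembl"],
    "structure visualization": ["RCSB PDB", "Chimera"],
    "domains and binding sites": ["UniProt", "NCBI", "Prosite", "AceView"],
    "function and pathways": ["UniProt", "AceView"],
    "subcellular localization": ["UniProt", "GeneCards", "AceView"],
    "expression": ["AceView", "HPRD", "GeneCards", "Ensembl"],
    "protein interactions": ["GeneCards"],
    "mutations and diseases": [
        "AceView", "ClinVar", "OMIM", "DisGeNET", "dbSNP", "NCBI",
    ],
}


def steps_by_resource(workflow):
    """Invert the step -> resources mapping into resource -> steps."""
    index = defaultdict(list)
    for step, resources in workflow.items():
        for resource in resources:
            index[resource].append(step)
    return dict(index)


resource_index = steps_by_resource(WORKFLOW)
for resource in sorted(resource_index):
    print(f"{resource}: {', '.join(resource_index[resource])}")
```

Listing the steps per resource makes it easy to see how heavily the analysis leaned on AceView compared with the other databases.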