Introduction to bioinformatics at Universität für Bodenkultur Wien


Introduction to bioinformatics

16. What is k-means clustering?

The k-means algorithm is an iterative procedure whose result depends on the
(randomly) chosen starting values.

Algorithm:
• Initially (at step 0), choose k observations at random; these points are
the initial cluster centroids
• Calculate the distance of each object to every centroid and assign the object
to the cluster with the nearest centroid
• When all objects have been assigned, recalculate the centroids of the k
clusters
• Repeat the last two steps until the centroids no longer change or a maximum
number of iterations is reached



k-means clustering:
• Partitions the data into k clusters in a single partitioning step, without
building a dendrogram as hierarchical clustering does
• The number of clusters k is specified in advance, and the objects are divided
into groups according to similarity
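
The algorithm steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the squared Euclidean distance, the fixed random seed and the example points are all assumptions made for this sketch.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster a list of coordinate tuples into k groups (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 0: choose k observations at random as the initial centroids.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assign every point to the cluster with the nearest centroid
        # (squared Euclidean distance is an assumption of this sketch).
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Recalculate each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)
        # Stop as soon as the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical example data: two well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, k=2)
```

Because the starting centroids are random, data that are less clearly separated than in this example can give different clusters for different seeds.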


24. What is the difference between heuristic and dynamic  
methods and when do we use which?

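Dynamic programming is exhaustive: it guarantees the optimal alignment for a given
scoring scheme, but it is slow, because time and memory grow with the product of
the sequence lengths.
Heuristic methods are much faster but are not guaranteed to find the optimal
alignment.
Dynamic programming is therefore used to align a small number of sequences
accurately, and heuristic methods are used for database searches and multiple
sequence alignment.

Dynamic programming:
• Needleman-Wunsch (global)
• Smith-Waterman (local)

Heuristic methods:
• Pairwise (word method for fast sequence alignment)
o BLAST
o FASTA
• Multiple sequence alignment
o Progressive
o Iterative
o Block-based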
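
To make the cost of dynamic programming concrete, here is a minimal sketch of the Needleman-Wunsch global alignment score. The function name and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration; real applications use substitution matrices and often affine gap penalties.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    # First column and row: a prefix aligned against gaps only.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Every cell is filled once, hence the len(a) * len(b) cost.
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diagonal, up, left)
    return score[-1][-1]


# Three matches and one gap: 3 * 1 + (-1) = 2
print(needleman_wunsch_score("ACGT", "AGT"))
```

The full table has to be filled for every pair of sequences, which is why heuristic tools such as BLAST are preferred when a query is compared against a whole database.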


6. Describe the classification of proteins based on secondary structure!

To establish the relationships among structures, a hierarchical classification
system is built in three steps:
a. Remove redundancy from databases
b. Separate structurally distinct domains within each structure (manually or with
algorithms)
c. Group proteins/domains with similar structures and cluster them

According to Levitt and Chothia, domain structures can be classified into 3 main
classes:
• α-domains: core built up exclusively from α-helices
• β-domains: usually 2 antiparallel β-sheets packed against each other
• α/β-domains: combinations of β-α-β motifs; parallel β-sheets surrounded by
α-helices

Two main databases:
a) SCOP:
• Based on manual examination of structures
• Grouped in: classes, folds, superfamilies and families
• Classes consist of folds with similar core structure

b) CATH:
• Proteins are classified by an automatic structural alignment
program combined with manual comparison
• Grouped in: class, architecture, topology, homologous superfamily,
homologous family



2. Explain the two experimental methods to determine protein
structure! What parameters are measured? How do we get the
structure?

a. X-ray crystallography:
The protein must be grown into a large crystal, in which the positions of the
molecules are fixed. When the crystal is irradiated with X-rays, the rays are
diffracted by the electron clouds surrounding the atoms, producing a regular
diffraction pattern. The diffraction pattern is converted into an electron
density map by Fourier transformation, and the atomic model is built into this
map. Parameters: the intensities of the diffracted spots are measured; the phases
are not recorded and have to be determined separately (the phase problem).

b. NMR spectroscopy:
It is based on detecting the spin states of atomic nuclei in a magnetic
field. Protein samples are labelled with the stable isotopes 13C and 15N.
Radio-frequency radiation induces transitions between nuclear spin states, and
the resulting radio signals are interpreted. From these signals the proximity of
and distances between labelled atoms are determined, and these distance
restraints are used to calculate the structure.


18. What is the molecular clock hypothesis and how is it used?

This hypothesis is used to estimate the time of speciation or mutation
events from fossil evidence or DNA/protein sequences. If molecules evolve at
constant rates, the number of accumulated mutations is proportional to evolutionary
time. The problem is that uniform evolutionary rates are rarely found because
of:
• Changing generation times
• Population size
• Species-specific differences
• Evolving functions of the encoded protein
• Changes in the intensity of natural selection

The “strict” clock assumes perfectly constant rates of evolution, whereas the
“relaxed” clock allows different evolutionary rates on different branches.
Calibration: individual molecular clocks can be tested for accuracy, but they need
to be calibrated against material evidence, such as fossils. Over long time spans,
estimates can be off by 50% or more.
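
As a rough illustration of how a strict clock is applied, the sketch below calibrates a substitution rate on one species pair with a fossil-dated split and uses it to date a second pair. It assumes the common relation distance = 2 × rate × time (both lineages accumulate changes after the split); the function names and all numbers are hypothetical.

```python
def calibrate_rate(distance, divergence_time):
    """Substitutions per site per unit time, from a pair with a known split.

    The factor 2 reflects that both lineages accumulate changes after
    the split (assumption of a strict clock).
    """
    return distance / (2.0 * divergence_time)


def estimate_divergence_time(distance, rate):
    """Divergence time of a pair, given its distance and a calibrated rate."""
    return distance / (2.0 * rate)


# Hypothetical calibration: 0.2 substitutions per site between two species
# whose split is dated to 100 million years ago by fossils.
rate = calibrate_rate(distance=0.2, divergence_time=100.0)

# Hypothetical second pair with 0.1 substitutions per site.
age = estimate_divergence_time(distance=0.1, rate=rate)
print(f"estimated split: about {age:.0f} million years ago")
```

A relaxed clock would replace the single `rate` with branch-specific rates, which is beyond this sketch.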


20. Is regression a supervised or an unsupervised method?

Regression is a supervised method: the model is fitted to training data for which
the outcome is already known. Supervised classification, by comparison, is the
classification of data into a set of predefined categories.
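
A small sketch of why regression counts as supervised: the fit uses known outcomes (`ys`) for every training input (`xs`). The function name `fit_line` and the data are assumptions for illustration; the fit is ordinary least squares for a straight line.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x to labelled data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums of the training data.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: every input x comes with a known outcome y.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)

# The fitted model can then predict the outcome for a new input.
prediction = intercept + slope * 5
```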


10. Explain 3 non-experimental methods to get protein models!
Why do they work?

a. Homology modelling:
Homology modelling relies on previous knowledge: the model is built from the
known structure of a homologous protein. It works because proteins that share a
high enough sequence similarity are likely to have a similar 3D structure. The
all-atom model is produced from an alignment with template proteins:
• Search for homologous proteins in the database
• Align sequences
• Determine structurally conserved regions
• Determine coordinates

b. Protein threading:
Protein threading predicts the fold of an unknown protein sequence by fitting
the sequence onto the folds in a structural database and selecting the
best-fitting model. It works because structure is more conserved than sequence,
so proteins without detectable sequence similarity can still share a fold.

c. Ab initio structure prediction:
It is based on a single query sequence and measures the relative propensity of
each amino acid to belong to a certain secondary structure element (scores
derived from crystal structures). Statistical programs then predict the
secondary structure elements.


11. Explain the Ramachandran plot and how it can be used to
evaluate models!

The Ramachandran plot is a diagram that visualizes the energetically stable
combinations of the backbone torsion angles φ and ψ for each of the
amino acids in a protein structure.
Rotation of the polypeptide backbone is limited to these two angles because the
peptide bond is planar. The plot shows φ and ψ against each other, maps the entire
conformational space of a peptide and shows allowed and disallowed regions. φ is
not allowed to be 0 degrees because two oxygen atoms would bump into each
other. If φ and ψ are both 0 degrees, a hydrogen and an oxygen atom would bump
into each other. The most stable conformation is at 180 degrees (fully extended
chain).
To evaluate a model, the φ/ψ pair of every residue is plotted: in a good model
most residues fall into allowed regions, and residues in disallowed regions point
to likely errors in the model.


12. Explain the β-barrel!

a. Closed barrel:
The closed barrel has a simple structure: each successive β-strand is added
next to the previous β-strand until the last one is joined by hydrogen
bonds to the first β-strand. The strands are antiparallel and connected by
hairpins. Closed barrels are often hydrophilic inside and hydrophobic outside.

b. Jelly roll barrel:
The jelly roll structure consists of 8 β-strands arranged in two four-stranded
antiparallel β-sheets that pack together across a hydrophobic interface.

c. TIM barrel:
The TIM barrel is a conserved protein fold consisting of 8 α-helices and
8 parallel β-strands that alternate along the peptide backbone.


13. Is clustering a supervised or an unsupervised method?

Clustering is an unsupervised method. It does not assume predefined categories;
instead it identifies categories in the data according to similar patterns. In
gene expression analysis, for example, genes with correlated expression profiles
are grouped into clusters.


17. Calculate the likelihood of a new sequence via PSSM!

 

Step 1: Start from a multiple sequence alignment.

Pos.  1  2  3  4  5  6
S1    A  T  G  T  C  G
S2    A  A  G  A  C  T
S3    T  A  C  T  C  A
S4    C  G  G  A  G  G
S5    A  A  C  C  T  G

Step 2: Calculate the observed frequencies of all residues at every position and
in total ("all"). A "-" means the residue was not observed at that position.

Pos.  1     2     3     4     5     6     all
A     0.6   0.6   -     0.4   -     0.2   0.30
T     0.2   0.2   -     0.4   0.2   0.2   0.20
G     -     0.2   0.6   -     0.2   0.6   0.27
C     0.2   -     0.4   0.2   0.6   -     0.23

Step 3: Normalize the frequencies by the overall occurrence of the residue
(position frequency divided by the "all" frequency).

Pos.  1     2     3     4     5     6
A     2     2     -     1.33  -     0.67
T     1     1     -     2     1     1
G     -     0.74  2.22  -     0.74  2.22
C     0.87  -     1.74  0.87  2.61  -

Step 4: Take the base-2 logarithm of the normalized values
(log2(x) = ln(x)/ln(2)) to obtain log-odds scores.

Pos.  1     2      3     4     5      6
A     1     1      -     0.4   -      -0.58
T     0     0      -     1     0      0
G     -     -0.43  1.15  -     -0.43  1.15
C     -0.2  -      0.8   -0.2  1.38   -

Step 5: Calculate the likelihood of a new sequence, e.g. AACTCG, by adding up the
log-odds scores of its residues at every position:

1 + 1 + 0.8 + 1 + 1.38 + 1.15 = 6.33

→ It is 2^6.33 ≈ 80 times more likely than by random chance that this sequence
fits the PSSM.
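
The hand calculation can be checked with a short script. This is a sketch under the same assumptions as the card: background frequencies are taken from the alignment itself and no pseudocounts are used, so unobserved residues get a score of minus infinity. The function names are made up for this example. Because the script does not round intermediate values, the score comes out at about 6.31 rather than 6.33.

```python
import math
from collections import Counter


def build_pssm(alignment):
    """Return one dict of log2-odds scores per alignment column."""
    n_seqs = len(alignment)
    length = len(alignment[0])
    # Overall ("all") frequency of each residue in the whole alignment.
    totals = Counter("".join(alignment))
    background = {res: n / (n_seqs * length) for res, n in totals.items()}
    pssm = []
    for pos in range(length):
        column = Counter(seq[pos] for seq in alignment)
        scores = {}
        for res, bg in background.items():
            freq = column[res] / n_seqs
            # Unobserved residues ("-" in the tables) score minus infinity
            # because no pseudocounts are applied in this sketch.
            scores[res] = math.log2(freq / bg) if freq > 0 else float("-inf")
        pssm.append(scores)
    return pssm


def score_sequence(pssm, sequence):
    """Sum the log-odds scores of the residues of `sequence`."""
    return sum(column[res] for column, res in zip(pssm, sequence))


alignment = ["ATGTCG", "AAGACT", "TACTCA", "CGGAGG", "AACCTG"]
pssm = build_pssm(alignment)
score = score_sequence(pssm, "AACTCG")
print(f"log-odds score: {score:.2f}, odds ratio: {2 ** score:.0f}")
```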




25. What are the 4 amino acids that are found in the core of
globular proteins?

• Leucine → participates in hydrophobic interactions
• Isoleucine → participates in hydrophobic interactions
• Methionine → non-reactive side chain
• Valine → non-reactive side chain

• Phenylalanine → aromatic interactions (π-stacking)
• Tyrosine → aromatic interactions (π-stacking)
