Beispielfach at TU Kaiserslautern | Flashcards & Summaries

Study materials for Beispielfach at TU Kaiserslautern

Sample flashcards for your Beispielfach course at TU Kaiserslautern, created by fellow students on StudySmarter.

Q:

Article Summarisation

Luhn's Algorithm

A:

● Pick a sentence with many frequent words in it
● Based on sentence detection and frequency analysis
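
A minimal sketch of the idea, assuming a naive regex-based sentence splitter and a tiny, hypothetical stop word list. It is a simplification: Luhn's original method scores clusters of significant words, while this version only counts frequent words per sentence. The names `summarise`, `tokenize` and `STOP_WORDS` are illustrative.

```python
import re
from collections import Counter

# Hypothetical, very short stop word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "it", "on"}


def tokenize(text):
    """Lower-case the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def summarise(text, top_words=5, num_sentences=1):
    """Return the sentences that contain the most frequent words."""
    # Naive sentence detection: split after '.', '!' or '?' plus whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Frequency analysis over all non-stop words of the document.
    words = [w for w in tokenize(text) if w not in STOP_WORDS]
    frequent = {w for w, _ in Counter(words).most_common(top_words)}

    def score(sentence):
        # Count every occurrence of a frequent word in the sentence.
        return sum(1 for w in tokenize(sentence) if w in frequent)

    ranked = sorted(sentences, key=score, reverse=True)
    chosen = set(ranked[:num_sentences])
    # Keep the chosen sentences in their original order.
    return [s for s in sentences if s in chosen]


text = ("Feeds list recent articles. "
        "Feed readers poll feeds and fetch articles from feeds. "
        "Cats sleep.")
print(summarise(text, top_words=2))
# ['Feed readers poll feeds and fetch articles from feeds.']
```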

Q:

Web 2.0 Principles

A:

● Web-based applications
● Reach out for the long tail
● Data is central
● Network effects from user contributions
● Lightweight programming models

Q:

Article Feeds

A:

● Fully adopted by convention
● Machine-readable list of recent articles

  • Along with metadata of the site and the articles

● RSS: Really Simple Syndication
● Atom as an alternative modern variant
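
A small sketch of reading such a feed, assuming an RSS 2.0 document with a single `channel` element. The feed content, the URLs and the function name `parse_rss` are made up for illustration; only the standard library XML parser is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed with site metadata and two articles.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.org/</link>
    <item>
      <title>First post</title>
      <link>https://example.org/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.org/second</link>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_text):
    """Extract the site title and the list of articles from an RSS string."""
    channel = ET.fromstring(xml_text).find("channel")
    return {
        "site": channel.findtext("title"),
        "articles": [
            {"title": item.findtext("title"), "link": item.findtext("link")}
            for item in channel.findall("item")
        ],
    }


feed = parse_rss(SAMPLE_RSS)
print(feed["site"])                            # Example Blog
print([a["title"] for a in feed["articles"]])  # ['First post', 'Second post']
```

Atom feeds use different element names and an XML namespace, so this sketch would need adapting for them.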

Q:

Article Summarisation

Entity Extraction

A:

● As opposed to document-centric analysis
● Noun-based recognition is a starting point

  • Look for collocations of NN-tagged tokens
  • Tokens are:
    • Dr./NNP
    • (PERSON Obradovic/NNP)
    • proposed/VBD
    • the/DT
    • lecture/NN
    • (PERSON Social/NNP Web/NNP)
    • Mining/NNP
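
The noun-based starting point can be sketched without a tagger by working on tokens that are already tagged. The sketch assumes Penn Treebank style tags as in the card; the function name `noun_collocations` is illustrative, and grouping by identical tag is one possible heuristic, not the only one.

```python
from itertools import groupby

# Tagged tokens from the card, without the PERSON chunk annotations.
tagged = [
    ("Dr.", "NNP"), ("Obradovic", "NNP"), ("proposed", "VBD"), ("the", "DT"),
    ("lecture", "NN"), ("Social", "NNP"), ("Web", "NNP"), ("Mining", "NNP"),
]


def noun_collocations(tagged_tokens):
    """Join runs of adjacent tokens that share the same NN* tag."""
    def noun_tag(token):
        tag = token[1]
        return tag if tag.startswith("NN") else None

    chunks = []
    for tag, group in groupby(tagged_tokens, key=noun_tag):
        if tag is not None:
            chunks.append(" ".join(word for word, _ in group))
    return chunks


print(noun_collocations(tagged))
# ['Dr. Obradovic', 'lecture', 'Social Web Mining']
```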
Q:

Data Sampling

Typical Challenges


A:

● Space is a limited resource in many cases

  • Set your sampling goals based on a hypothesis

● Time is mostly restricted by network latency

  • Use parallelised HTTP queries or a thread pool

● One-time generated data sets vs. Monitoring
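
A hedged sketch of the thread pool approach using only the standard library. The names `fetch` and `fetch_all` and the default of 8 workers are assumptions for illustration; the `fetcher` parameter exists so the download step can be swapped out, for example for testing without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one URL and return the raw response body."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetcher=fetch, max_workers=8):
    """Fetch all URLs concurrently and return a {url: result} mapping."""
    urls = list(urls)  # allow generators, we iterate twice below
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        return dict(zip(urls, pool.map(fetcher, urls)))


# Usage with a stand-in fetcher instead of real HTTP requests:
print(fetch_all(["a", "b"], fetcher=lambda url: url.upper()))
# {'a': 'A', 'b': 'B'}
```

Threads help here because the work is dominated by waiting on the network rather than by computation.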

Q:

Clustering Basics

Hierarchical Clustering

A:

● Agglomerative clustering

  • Bottom-up approach
  • Full matrix computation, iterative linking
  • O(n² log n) in the general case
  • Deterministic

● As opposed to divisive clustering

  • O(2ⁿ) in the general case
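
The bottom-up approach can be sketched as follows, assuming one-dimensional numeric points and single linkage (the distance between two clusters is the distance of their closest members). This naive version recomputes all linkages in every round, so it is slower than the O(n² log n) bound above; the function name `agglomerative` is illustrative.

```python
def agglomerative(points, num_clusters):
    """Merge the two closest clusters until num_clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per point

    def linkage(a, b):
        # Single linkage: smallest distance between any two members.
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


print(agglomerative([1, 2, 10, 11, 50], 3))
# [[1, 2], [10, 11], [50]]
```

The result is deterministic: the same input always produces the same merge order.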
Q:

REST Architectural Constraints

A:

• Client-Server (client does not care about server-side storage, server does not care about client state)
• Stateless (client sends requests for state transition)
• Cacheable
• Layered System (proxies, gateways, firewalls)
• Code on demand (optional)
• Uniform Interface

Q:

REST Architectural Properties

A:

• Performance
• Scalability
• Simplicity
• Modifiability
• Visibility
• Portability
• Reliability

Q:

Our Practical Definition

A:

Social Web = Social Media + Social Networking

Q:

Clustering Basics

Normalising Data

A:

● Unify diverse variations of the same entity
● Unify highly similar entities when desired
● Usually heuristic approaches
● Highly domain-specific and customised solutions
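
As an illustration of such a heuristic, the sketch below unifies spelling variants of company names. The suffix list and the name `normalise` are hypothetical, and a real solution would be tailored to its domain.

```python
import re

# Hypothetical list of legal-form suffixes to ignore.
SUFFIXES = {"inc", "corp", "ltd", "gmbh", "ag"}


def normalise(name):
    """Lower-case, drop punctuation and strip legal-form suffixes."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(t for t in tokens if t not in SUFFIXES)


print(normalise("Acme, Inc."))  # acme
print(normalise("ACME GmbH"))   # acme
```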

Q:

Clustering Basics

Edit Distance


A:

● Typically Levenshtein distance, based on edit operations

  • Each insertion, deletion or replacement of a character counts as 1

● nltk.metrics.distance.edit_distance(a, b)
● Computational cost is O(m*n) per pair
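
To show where the O(m*n) cost comes from, here is a plain-Python sketch of the dynamic programming table behind Levenshtein distance. It is an illustrative reimplementation under the unit costs listed above, not the NLTK source; the name `levenshtein` is an assumption.

```python
def levenshtein(a, b):
    """Return the Levenshtein distance between strings a and b."""
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                     # delete char_a
                current[j - 1] + 1,                  # insert char_b
                previous[j - 1] + (char_a != char_b),  # replace or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Each of the m rows fills n cells, which gives the O(m*n) cost per pair; keeping only two rows limits memory to O(n).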


Q:

Clustering Basics

Principal Idea

A:

● Partition of a collection

  • Based on a certain property
  • Usually expressed with a similarity function

● Data Normalisation

  • Bring all the data into the required format

● Dimensionality Reduction

  • Reduce dimensions for clustering with 10 or more features
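
A rough sketch of partitioning with a similarity function, assuming Jaccard similarity over word sets and a greedy assignment to the first sufficiently similar cluster. The names `jaccard` and `partition` and the threshold of 0.5 are illustrative choices, not part of the course material.

```python
def jaccard(a, b):
    """Jaccard similarity of the word sets of two strings."""
    set_a, set_b = set(a.split()), set(b.split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def partition(items, similarity, threshold):
    """Greedily assign each item to the first cluster it is similar to."""
    clusters = []
    for item in items:
        for cluster in clusters:
            # Compare against the first member as cluster representative.
            if similarity(item, cluster[0]) >= threshold:
                cluster.append(item)
                break
        else:
            clusters.append([item])  # no similar cluster found
    return clusters


titles = ["social web mining", "web mining", "data sampling"]
print(partition(titles, jaccard, 0.5))
# [['social web mining', 'web mining'], ['data sampling']]
```

The result depends on item order and on the threshold, which is why the similarity function and normalisation step matter so much in practice.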

