Big Data And Social Media at University Of Zürich | Flashcards & Summaries

Study materials for Big Data and Social Media at the University of Zürich

Sample flashcards for the Big Data and Social Media course at the University of Zürich, created by fellow students on StudySmarter.

Q:

Reidentification

A:

Reidentification
– When you need the information, you need to be able to reidentify it.
– Necessity of reidentification
• Sometimes data has to be reidentified in order to fully reconstruct the collected data.
• Reidentification has to be performed very carefully, protecting the privacy of the objects in the data.

Q:

What are noise and signal?

A:

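Noise and signal: The signal is the meaningful pattern in the data; the noise is the random or irrelevant variation that obscures it. The major purpose of data analysis is to isolate signal from noise.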
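To make the idea of isolating signal from noise concrete, here is a minimal sketch. It is not taken from the course material: the linear trend, the Gaussian noise, the window size, and the helper names `moving_average` and `mean_abs_error` are all assumptions chosen for illustration.

```python
import random


def moving_average(values, window):
    """Centered moving average; returns len(values) - window + 1 points."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def mean_abs_error(estimate, truth):
    """Average absolute distance between an estimate and the true values."""
    return sum(abs(e - t) for e, t in zip(estimate, truth)) / len(truth)


rng = random.Random(0)                              # fixed seed for repeatability
signal = [0.5 * t for t in range(200)]              # assumed "true" signal: a linear trend
observed = [s + rng.gauss(0, 1) for s in signal]    # what we actually measure: signal + noise

window = 5
half = window // 2
smoothed = moving_average(observed, window)

# Align the raw data with the smoothed series (the edges have no full window).
aligned_signal = signal[half:len(signal) - half]
aligned_observed = observed[half:len(observed) - half]

raw_error = mean_abs_error(aligned_observed, aligned_signal)
smoothed_error = mean_abs_error(smoothed, aligned_signal)
print(f"error before smoothing: {raw_error:.2f}, after: {smoothed_error:.2f}")
```

Averaging neighbouring points cancels part of the random variation while leaving the trend intact, so the smoothed series should sit closer to the underlying signal than the raw observations do.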

Q:

Sample size calculation: What is said to be an infinite population size?

A:

A population of more than 400,000 samples
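The card does not say which formula the course uses. As a sketch, assuming Cochran's formula with a finite population correction (a common choice, not confirmed by the source), the following shows why a very large population can be treated as infinite. The defaults (z = 1.96 for 95% confidence, p = 0.5, 5% margin of error) are assumptions as well.

```python
import math


def sample_size(population=None, z=1.96, p=0.5, margin=0.05):
    """Required sample size; population=None means 'treat as infinite'.

    Assumes Cochran's formula; z, p and margin are illustrative defaults.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # size for an infinite population
    if population is None:
        return math.ceil(n0)
    # Finite population correction: shrinks n0 noticeably only for small populations.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


for size in (1_000, 400_000, None):
    print(size, sample_size(size))
```

Under these assumptions the required sample for a population of 400,000 differs from the infinite case by a single unit, while a population of 1,000 needs a clearly smaller sample.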

Q:

What is the difference between big and large data?

A:

Two different kinds of large data sets
− Large data: a data set that only needs large storage and processing capacity (it is not really Big Data)
− Big Data: data that needs to be formalized
− Metadata: a set of data that describes and gives information about other data

Q:

What three major elements do you need to understand Big Data?

A:

Search
Retrieval
Analysis

Q:

What is the difference between big and small data?

Goal, location, data structure, data preparation, longevity, measurement, reproducibility, stakes, introspection, analysis.

A:

Each point gives small data first, then Big Data.

Goal: collected to answer one particular question vs. usually designed with no single goal in mind

Location: one or several PCs/servers vs. several large servers

Data structure: highly structured and from one discipline vs. unstructured and usable by multiple disciplines

Data preparation: one source or a third party vs. many different sources

Longevity: kept for about 7 years on average vs. kept indefinitely

Measurement: one standard protocol vs. many experimental protocols

Reproducibility: the project can be repeated vs. repetition is not feasible in most cases

Stakes: limited project costs vs. very high costs

Introspection: spreadsheet vs. a method of introspection

Analysis: all at once vs. parallel processing

Q:

What is Big Data? (the V's)

A:

− Three V's
1. Volume – how big?
2. Variety – how mixed, where is it from?
3. Velocity – how quickly does it change?
− Or four V's
4. Veracity – how trustworthy is it?
a. It is disputed whether veracity belongs on the list, because all data needs to be trustworthy; data that is not trustworthy is not small data, it is simply not credible.

Q:

Data sampling methodologies

A:

Non-probability sampling

Probability sampling:

  • simple random sample
  • systematic sample (e.g., every 20th element)
  • stratified sample (for subgroups/minorities)
  • cluster sample

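The four probability sampling methods can be sketched as follows. This is an illustrative sketch using only Python's standard library; the function names, the `stratum_of` callback, and the rule of taking at least one element per stratum are assumptions, not part of the course material.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, rng):
    """Every element has the same chance of being picked."""
    return rng.sample(population, n)


def systematic_sample(population, step, rng):
    """Random starting point, then every step-th element (e.g. every 20th)."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(population, stratum_of, fraction, rng):
    """Sample each subgroup separately so that minorities are represented."""
    strata = defaultdict(list)
    for item in population:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # assumed: at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for name in chosen for item in clusters[name]]


rng = random.Random(42)
population = list(range(100))
clusters = {name: list(range(i * 5, i * 5 + 5)) for i, name in enumerate("ABCD")}

print(simple_random_sample(population, 10, rng))
print(systematic_sample(population, 20, rng))
print(stratified_sample(population, lambda x: "minority" if x < 10 else "majority", 0.1, rng))
print(cluster_sample(clusters, 2, rng))
```

Stratified sampling guarantees that the small subgroup appears in the sample, whereas a simple random sample of the same size could miss it entirely.
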
Q:

What is the data about?

A:

→ What topics are trending?
→ What topics are emerging?
→ What topics are becoming less important?
→ What are current risks?
→ What are current opportunities?
→ What is the attitude or sentiment?
→ How are attitude and sentiment changing?
→ What should we expect next?

Q:

What is a good definition of Big Data?

A:

Big data is facts and statistics collected from large, originally unstructured file formats which are derived from various and sometimes unrelated sources, for reference or analysis.

Q:

What is the purpose of Data analysis?

A:

Both small and Big Data analysis serve the same purpose: To detect the signal!
