Data mosaic, data analysis and relevance in advertising

machine-learning
data-mosaic
data-mining

(Adavidoaiei Dumitru-Cornel) #1

Our CEO in the UK talked to us about data mosaic. It consists of collecting data about potential customers from social networks, preferences, likes and other sources, then using this data mosaic together with data analysis to build a customer profile and display adverts relevant to that customer. The adverts are shown on screens in shops, pubs and other places. And speaking of algorithms, @flavius, it has to rely on algorithms whose goal is to show those ads to someone who is interested in the product, so that the ad sells and whoever pays for the service is satisfied.
From what I discussed with the CEO, a data mosaic is something several companies are trying to build: Google, Amazon, Facebook, Microsoft, IBM, and of course this is where big data comes in.
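
To make the idea above a bit more concrete, here is a minimal sketch of merging several data sources into one profile and ranking adverts by relevance. Everything in it is an assumption for illustration: the source names, the interest tags, the ad names and the choice of cosine similarity as the relevance score are hypothetical, not what any of the companies mentioned actually use.

```python
from collections import Counter
from math import sqrt


def build_profile(*sources):
    """Merge interest counts from several sources into one profile (the 'mosaic')."""
    profile = Counter()
    for source in sources:
        profile.update(source)
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse interest vectors (dict-like)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_ads(profile, ads):
    """Return ad names sorted from most to least relevant for the profile."""
    return sorted(ads, key=lambda name: cosine(profile, ads[name]), reverse=True)


# Hypothetical, made-up data: two sources describing the same customer.
social_likes = {"craft beer": 3, "football": 2}
wifi_checkins = {"pub": 4, "football": 1}
profile = build_profile(social_likes, wifi_checkins)

# Hypothetical adverts, each described by the interests it targets.
ads = {
    "beer_promo": {"craft beer": 1, "pub": 1},
    "running_shoes": {"running": 1, "fitness": 1},
    "match_tickets": {"football": 1},
}

ranking = rank_ads(profile, ads)
print(ranking)
```

The screen in the shop would then show the first entries of `ranking`. A real system would need consent handling and far richer features; this only shows the "combine sources, then score" shape of the problem.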


(Ex. Dakull) #2

you mean this?

… there is the threat that disparate threads can be pieced together in a way that yields information that is supposed to be private.

via http://fcw.com/articles/2014/05/13/fose-mosaic.aspx

This kind of analysis through the combination of big data sets is called the mosaic effect.

This sounds awful.


(Adavidoaiei Dumitru-Cornel) #3

It depends what you understand by that. The idea is that we do not use private data without the customer's consent, but we can force the customer to log in while in the shop using social network accounts and authenticated Wi-Fi, and use similar strategies to collect data about the customer.

Regarding
http://fcw.com/articles/2014/05/13/fose-mosaic.aspx

We do not expose this data. It stays private at company level and is used for commercial purposes; I see nothing wrong with that.

I don't think the government was the best example to give. The most relevant one seems to me to be Facebook, which collects the most data about us, with Google at the same level, since we are discussing something commercial, about selling something.


(Ex. Dakull) #4

e.g.: if I take the list of events/check-ins of a user from FB, I can “triangulate” which area or neighbourhood they live in - that is private information and is creepy as fuck.

Thanks Allah that at least I’m not merely a consumer but the worm that helps the gears move thus I can have a wee bit more insight about stuff than the average Joe.

tl;dr: We’re already living in an Orwellian utopia of sorts :slight_smile:


(Adavidoaiei Dumitru-Cornel) #5

Come on people, don't be so paranoid, I'm at a beer marathon


(Ex. Dakull) #6

Cheers!

** somewhere an ad is being tailored specifically to your location and beer allegiance :slight_smile: **


(Adavidoaiei Dumitru-Cornel) #7

Data mosaic can have interesting applications in fighting crime and terrorism. People always have a choice, as they did in the past: build nuclear bombs to destroy, or nuclear power plants to have energy.


(Ex. Dakull) #8

btw. it seems I am not the only paranoid one thinking about the ethical implications of big data :sunny:

The Internet of Things is pitched as good for the consumer. But is it? At this point, it seems exceptionally awesome for those companies working on products for it. The benefit to the average homeowner pales dramatically in relation to the benefit for the companies poised to accumulate infinite amounts of actionable data. You and I benefit by determining whether our dog got enough exercise last Wednesday. Is that a fair tradeoff? Doesn’t feel like it.

via http://www.nytimes.com/2015/09/06/opinion/sunday/allison-arieff-the-internet-of-way-too-many-things.html


(Adavidoaiei Dumitru-Cornel) #11

Big data is something that helps us better understand certain phenomena, but like anything it can be used for good or bad purposes. That is why I like open source: in general the intentions there are clearer and more visible, open just like the sources :smile: .


(Adavidoaiei Dumitru-Cornel) #12

(Adavidoaiei Dumitru-Cornel) #13

I find it interesting that you can access a snapshot of the web for free from Common Crawl:

The crawl archive for June 2015 is now available! This crawl archive is over 145TB in size and holds more than 1.81 billion webpages. The files are located in the aws-publicdatasets bucket at /common-crawl/crawl-data/CC-MAIN-2015-32/.

Common Crawl links to the data archive by year on Amazon S3:
http://commoncrawl.org/the-data/get-started/

[Other links: Machine Learning Data Set Collections](http://www.datasciencecentral.com/m/blogpost?id=6448529%3ABlogPost%3A341263)

[A somewhat older but still relevant and interesting link about how Google collects data](http://royal.pingdom.com/2010/01/08/how-google-collects-data-about-you-and-the-internet/)

A classification of Machine Learning algorithms; I hope to develop the topic further:

Machine Learning
1.) supervised learning
i.) binary classification (Example: learning from emails marked spam / not spam; binary because the classification has two possible categories)
ii.) multiclass classification (Example: classifying a flower by sepal and petal width and length into a category: setosa, versicolor, …, based on previously learned data; multiclass because there are more than two categories)
iii.) regression (Example: forecasting the temperature for a day from previously stored data)
2.) unsupervised learning - clustering (Example: grouping documents into n clusters, without predefined labels)
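
The difference between the branches of this classification can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the flower measurements and temperatures are made-up values, and 1-nearest-neighbour, a least-squares line and a basic k-means are just simple stand-ins for each category, not the only (or best) algorithms. Binary classification is the same as the classifier below with only two labels.

```python
from math import dist

# Supervised: labelled examples (petal length, petal width) -> species.
# The numbers are made-up illustrative values, not real measurements.
labelled = [
    ((1.4, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.5, 1.5), "versicolor"),
    ((4.7, 1.4), "versicolor"),
    ((6.0, 2.5), "virginica"),
    ((5.8, 2.2), "virginica"),
]


def classify_1nn(point):
    """Multiclass classification: label of the nearest labelled example."""
    _, label = min(labelled, key=lambda example: dist(example[0], point))
    return label


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def kmeans(points, centroids, iterations=10):
    """Unsupervised clustering: no labels, points are grouped around centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters


predicted = classify_1nn((4.6, 1.3))
slope, intercept = fit_line([1, 2, 3, 4], [12.0, 14.0, 16.0, 18.0])  # day -> temperature
clusters = kmeans([p for p, _ in labelled], [(1.0, 0.0), (6.0, 2.0)])
print(predicted, slope, intercept, [len(c) for c in clusters])
```

Note that `kmeans` never sees the species names: it only finds groups, which is why clustering is listed separately from the supervised cases.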

You can also find an overview at:
http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/

As bibliography for the presentation I used a course which unfortunately seems to be accessible only with a paid account; the public section of the site appears to be under construction:


(Adavidoaiei Dumitru-Cornel) #14

In principle, building a data mosaic means building a machine learning model that makes predictions, and for that you need data and a machine learning algorithm.
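
As a rough sketch of "data + algorithm = model that predicts", here is a tiny logistic regression trained with stochastic gradient descent. The features (number of liked beer pages, number of pub check-ins), the labels (whether the customer responded to a beer advert) and all the numbers are hypothetical; logistic regression is simply one easy algorithm to show, not a claim about what a real data mosaic would use.

```python
from math import exp


def sigmoid(z):
    """Squash a score into a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))


def train(rows, labels, epochs=500, lr=0.1):
    """The algorithm: fit logistic-regression weights with per-example gradient steps."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            error = sigmoid(score) - y  # gradient of the log loss w.r.t. the score
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(model, x):
    """The model: estimated probability that the customer is interested."""
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


# The data (made up): (liked beer pages, pub check-ins) -> responded to a beer advert?
rows = [(5, 3), (4, 4), (3, 5), (0, 1), (1, 0), (0, 0)]
labels = [1, 1, 1, 0, 0, 0]

model = train(rows, labels)
likely = predict(model, (5, 5))
unlikely = predict(model, (0, 0))
print(round(likely, 3), round(unlikely, 3))
```

With more data sources feeding `rows`, the same training loop gives a better-informed prediction, which is the sense in which the mosaic and the model go together.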


(Adavidoaiei Dumitru-Cornel) #15

@dakull I agree that there has to be ethics in data mosaic, machine learning and big data; I am just trying to understand how these things work.