2024-03-29T15:13:47Z
http://oai-repositori.upf.edu/oai/request
oai:repositori.upf.edu:10230/33511
2021-12-14T12:32:33Z
com_10230_26995
com_10230_16441
col_10230_26998
Garriga Calleja, Roger
Mas Adell, Javier
Poudel, Saurav
2017-12-15T12:58:56Z
2017-12-15T12:58:56Z
2017
http://hdl.handle.net/10230/33511
Master's thesis for: Master's Degree in Data Science. Academic year 2016-2017
Director: Christian Fons-Rosen
The Internet has grown tremendously in the last few years. As a result, a large amount of information about almost anything is available on the web, and the use of recommendation systems has become more important than ever.
Recommendation systems address this problem by guiding users through the ocean of information. Until now, recommendation systems have been used extensively in e-commerce and in communities where items such as movies, music and articles are recommended. More recently, recommendation systems have been deployed in online music players, recommending music that users will probably like.
This thesis presents the design, implementation, testing and evaluation of a recommendation system within the music domain, in which three different approaches for producing recommendations are used.
Each approach is tested by evaluating the recommendation system using precision scores. Our results show that the functionality of the recommendation system is satisfactory, and that recommendation precision differs across the three filtering approaches.
Internet ha experimentat un gran creixement en els últims anys. Per això, tenim molta informació sobre la majoria de les coses a la web. I l'ús dels sistemes de recomanació s'ha tornat més important que mai.
Els sistemes de recomanació tracten aquest problema, guiant els usuaris a través del gran oceà d'informació. Fins ara, els sistemes de recomanació s'han utilitzat àmpliament en el comerç electrònic i en comunitats on es recomanen elements com ara pel·lícules, música i articles. Més recentment, els sistemes de recomanació s'han desplegat en reproductors de música en línia, recomanant música que probablement agradarà als usuaris.
Aquesta tesi presenta el disseny, la implementació, la prova i l'avaluació d'un sistema de recomanació dins del domini musical, on s'utilitzen tres enfocaments diferents per a la producció de recomanacions.
La prova de cada enfocament es fa avaluant els sistemes de recomanació mitjançant puntuacions de precisió. Els nostres resultats mostren que la funcionalitat del sistema de recomanació és satisfactòria i que la precisió de les recomanacions difereix entre els tres enfocaments de filtratge.
Submitted by Montserrat FERNÁNDEZ TEIXIDÓ (montserrat.fernandez@upf.edu) on 2017-12-15T12:58:55Z
No. of bitstreams: 2
GarrigaTFMDS2017.pdf: 492122 bytes, checksum: 5e099400a8bb160fb8cf78ad9de6db8e (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
Made available in DSpace on 2017-12-15T12:58:56Z (GMT). No. of bitstreams: 2
GarrigaTFMDS2017.pdf: 492122 bytes, checksum: 5e099400a8bb160fb8cf78ad9de6db8e (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2016-2017
A data science game: assessing music recommendation with factorization machines
info:eu-repo/semantics/masterThesis
THUMBNAIL
GarrigaTFMDS2017.pdf.jpg
GarrigaTFMDS2017.pdf.jpg
IM Thumbnail
image/jpeg
9986
http://repositori.upf.edu/bitstream/10230/33511/4/GarrigaTFMDS2017.pdf.jpg
3b53ca58313fa2290b48ae99834f232a
MD5
4
TEXT
GarrigaTFMDS2017.pdf.txt
GarrigaTFMDS2017.pdf.txt
Extracted text
text/plain
52200
http://repositori.upf.edu/bitstream/10230/33511/3/GarrigaTFMDS2017.pdf.txt
d28fb4665ed6ffbd18ec22cb4550c372
MD5
3
CC-LICENSE
license_rdf
license_rdf
application/rdf+xml; charset=utf-8
1232
http://repositori.upf.edu/bitstream/10230/33511/2/license_rdf
b51f25f83cca752633b6ec4c418dbcc7
MD5
2
ORIGINAL
GarrigaTFMDS2017.pdf
GarrigaTFMDS2017.pdf
application/pdf
492122
http://repositori.upf.edu/bitstream/10230/33511/1/GarrigaTFMDS2017.pdf
5e099400a8bb160fb8cf78ad9de6db8e
MD5
1
10230/33511
oai:repositori.upf.edu:10230/33511
2021-12-14 13:32:33.55
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/33512
2021-12-14T12:02:09Z
com_10230_26995
com_10230_16441
col_10230_26998
Höllwirth, Hans-Peter
2017-12-15T13:03:59Z
2017-12-15T13:03:59Z
2017
http://hdl.handle.net/10230/33512
Master's thesis for: Master's Degree in Data Science. Academic year 2016-2017
Director: Christian Brownlees
This paper studies a novel particle filter method proposed by Brownlees and Kristensen (2017) for parameter estimation of nonlinear state space models. The particle filter, named Importance Sampling Particle Filter, is tested and compared to other established particle filters on two variations of a local level model. Inspections of the log-likelihood plots with respect to model parameters, as well as Monte Carlo maximum likelihood estimations establish the correctness of the new method. Finally, the novel particle filter is successfully applied to a nonlinear state space model, the hierarchical dynamic Poisson model.
Aquest article estudia un nou mètode de filtre de partícules proposat per Brownlees i Kristensen (2017) per a l'estimació de paràmetres de models d'espai d'estats no lineals. El filtre de partícules, anomenat Importance Sampling Particle Filter, es prova i es compara amb altres filtres de partícules establerts en dues variacions d'un model de nivell local. Les inspeccions dels gràfics de la log-versemblança respecte als paràmetres del model, així com les estimacions de màxima versemblança de Monte Carlo, estableixen la correcció del nou mètode. Finalment, el nou filtre de partícules s'aplica amb èxit a un model d'espai d'estats no lineal, el model de Poisson dinàmic jeràrquic.
Submitted by Montserrat FERNÁNDEZ TEIXIDÓ (montserrat.fernandez@upf.edu) on 2017-12-15T13:03:59Z
No. of bitstreams: 2
HollwirthTFMDS2017.pdf: 1328253 bytes, checksum: c2477fab574c736cd1cfd262bef50353 (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
Made available in DSpace on 2017-12-15T13:03:59Z (GMT). No. of bitstreams: 2
HollwirthTFMDS2017.pdf: 1328253 bytes, checksum: c2477fab574c736cd1cfd262bef50353 (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2016-2017
Particle filtering for nonlinear state space models
info:eu-repo/semantics/masterThesis
THUMBNAIL
HollwirthTFMDS2017.pdf.jpg
HollwirthTFMDS2017.pdf.jpg
IM Thumbnail
image/jpeg
8922
http://repositori.upf.edu/bitstream/10230/33512/4/HollwirthTFMDS2017.pdf.jpg
05452accbef7a6d2aa209c14d3f94cc4
MD5
4
TEXT
HollwirthTFMDS2017.pdf.txt
HollwirthTFMDS2017.pdf.txt
Extracted text
text/plain
49462
http://repositori.upf.edu/bitstream/10230/33512/3/HollwirthTFMDS2017.pdf.txt
e6920b7d4838e82b8f9a70912a0ec9a9
MD5
3
CC-LICENSE
license_rdf
license_rdf
application/rdf+xml; charset=utf-8
1232
http://repositori.upf.edu/bitstream/10230/33512/2/license_rdf
b51f25f83cca752633b6ec4c418dbcc7
MD5
2
ORIGINAL
HollwirthTFMDS2017.pdf
HollwirthTFMDS2017.pdf
application/pdf
1328253
http://repositori.upf.edu/bitstream/10230/33512/1/HollwirthTFMDS2017.pdf
c2477fab574c736cd1cfd262bef50353
MD5
1
10230/33512
oai:repositori.upf.edu:10230/33512
2021-12-14 13:02:09.684
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/33513
2021-12-14T12:00:21Z
com_10230_26995
com_10230_16441
col_10230_26998
Lange, Robert Tjarko
2017-12-15T13:07:46Z
2017-12-15T13:07:46Z
2017
http://hdl.handle.net/10230/33513
Master's thesis for: Master's Degree in Data Science. Academic year 2016-2017
Directors: Prof. Ioannis Kosmidis (UCL) and Prof. Omiros Papaspiliopoulos (UPF)
Data science questions face a fundamental trade-off between complexity, generalizability and computational feasibility. The need for quick estimation and evaluation of a vast number of statistical models has given rise to a plethora of new and innovative algorithms in the field of randomized numerical linear algebra (RandNLA). These algorithms aim to decrease effective running time by approximating exact solutions. One commonly allows for some ε-"slack" in order to make use of powerful subspace embedding ideas such as the Johnson-Lindenstrauss transform (JLT). In this way, one is able to significantly reduce the dimensionality of the problem while preserving a substantial amount of the original structure. Petros Drineas and Michael Mahoney have been applying these ideas to a range of problems such as solving linear systems of equations (over- and under-constrained), matrix completion and low-rank matrix approximation.
Les qüestions de ciència de dades afronten el compromís fonamental entre complexitat, generalitzabilitat i viabilitat computacional. La necessitat d'una ràpida estimació i avaluació d'una gran quantitat de models estadístics ha donat lloc a una infinitat d'algorismes nous i innovadors en el camp de l'àlgebra lineal numèrica aleatoritzada (RandNLA). Tenen la intenció de disminuir el temps d'execució efectiu mitjançant l'aproximació de solucions exactes. Habitualment es permet un cert marge ε per a fer ús d'idees potents d'immersió de subespais com la transformació de Johnson-Lindenstrauss (JLT). D'aquesta manera, es pot reduir significativament la dimensionalitat del problema, tot conservant una quantitat substancial de l'estructura original. Petros Drineas i Michael Mahoney han estat aplicant aquestes idees a una sèrie de problemes com la resolució de sistemes lineals d'equacions (sobredeterminats i subdeterminats), la compleció de matrius i l'aproximació de matrius de rang baix.
Submitted by Montserrat FERNÁNDEZ TEIXIDÓ (montserrat.fernandez@upf.edu) on 2017-12-15T13:07:46Z
No. of bitstreams: 2
LangeTFMDS2017.pdf: 552783 bytes, checksum: 93814ac558b1e827d09f707f5ffd2681 (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
Made available in DSpace on 2017-12-15T13:07:46Z (GMT). No. of bitstreams: 2
LangeTFMDS2017.pdf: 552783 bytes, checksum: 93814ac558b1e827d09f707f5ffd2681 (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2016-2017
Randomized numerical linear algebra for generalized linear models with big datasets
info:eu-repo/semantics/masterThesis
THUMBNAIL
LangeTFMDS2017.pdf.jpg
LangeTFMDS2017.pdf.jpg
IM Thumbnail
image/jpeg
10382
http://repositori.upf.edu/bitstream/10230/33513/4/LangeTFMDS2017.pdf.jpg
f7c4d66b02d97443bce7d39a41a469d7
MD5
4
TEXT
LangeTFMDS2017.pdf.txt
LangeTFMDS2017.pdf.txt
Extracted text
text/plain
90992
http://repositori.upf.edu/bitstream/10230/33513/3/LangeTFMDS2017.pdf.txt
d7ffbfe964a8ecfee22421f84d21e65c
MD5
3
CC-LICENSE
license_rdf
license_rdf
application/rdf+xml; charset=utf-8
1232
http://repositori.upf.edu/bitstream/10230/33513/2/license_rdf
b51f25f83cca752633b6ec4c418dbcc7
MD5
2
ORIGINAL
LangeTFMDS2017.pdf
LangeTFMDS2017.pdf
application/pdf
552783
http://repositori.upf.edu/bitstream/10230/33513/1/LangeTFMDS2017.pdf
93814ac558b1e827d09f707f5ffd2681
MD5
1
10230/33513
oai:repositori.upf.edu:10230/33513
2021-12-14 13:00:21.992
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/35843
2021-12-14T11:59:00Z
com_10230_26995
com_10230_16441
col_10230_26998
Hao, Kwa Jie
2018-11-23T12:42:12Z
2018-11-23T12:42:12Z
2018
http://hdl.handle.net/10230/35843
Master's thesis for: Master's Degree in Data Science. Academic year 2017-2018
Directors: Omiros Papaspiliopoulos (UPF) and Giacomo Zanella (Bocconi)
Hierarchical modeling is a practical approach with proven results in modeling real-world data. This paper studies Gaussian hierarchical models and methods that exploit the sparse conditional independence structure of such models to conduct scalable and efficient inference using sparse linear algebra. The efficiency of such methods depends strongly on the row and column ordering of the precision matrix to be factorized. The key finding of this paper is that, for Gaussian hierarchical models, a depth-first permutation of the precision matrix yields an optimal fill-in ratio. This makes permutation algorithms such as AMD or MMD unnecessary once the model is known to be a Gaussian hierarchical model, saving computation. A lexicographical ordering was also found to yield an optimal fill-in ratio.
Els models jeràrquics són una aproximació pràctica amb resultats provats en el modelatge de dades del món real. Aquest article estudia els models jeràrquics gaussians i els mètodes que exploten l’estructura dispersa d’independència condicional d’aquests models per a dur a terme inferència escalable i eficient utilitzant mètodes d’àlgebra lineal dispersa. L’eficiència d’aquests mètodes depèn en gran mesura de l’ordenació de files i columnes de la matriu de precisió a factoritzar. La troballa clau d’aquest article és que, per a models jeràrquics gaussians, una permutació en profunditat de la matriu de precisió garanteix una ràtio d’emplenament òptima. Això fa innecessari l’ús d’algoritmes de permutació com AMD o MMD un cop se sap que el model utilitzat és un model jeràrquic gaussià, cosa que estalvia càlcul. També s’ha trobat que la ràtio d’emplenament és òptima amb una ordenació lexicogràfica.
Submitted by Montserrat FERNÁNDEZ TEIXIDÓ (montserrat.fernandez@upf.edu) on 2018-11-23T12:42:12Z
No. of bitstreams: 2
HaoTFMDS2018.pdf: 364859 bytes, checksum: 0cebdd5b496ce55750614957057a74ae (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
Made available in DSpace on 2018-11-23T12:42:12Z (GMT). No. of bitstreams: 2
HaoTFMDS2018.pdf: 364859 bytes, checksum: 0cebdd5b496ce55750614957057a74ae (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2017-2018
Hierarchical models
Bayesian statistics
Sparse linear algebra
Matrix permutation
Models jeràrquics
Estadística bayesiana
Àlgebra dispersa lineal
Matriu de permutació
Scalable inference for Gaussian hierarchical models
info:eu-repo/semantics/masterThesis
THUMBNAIL
HaoTFMDS2018.pdf.jpg
HaoTFMDS2018.pdf.jpg
IM Thumbnail
image/jpeg
8867
http://repositori.upf.edu/bitstream/10230/35843/4/HaoTFMDS2018.pdf.jpg
d1ff07214d3145fd00d7ab8cf316c83b
MD5
4
TEXT
HaoTFMDS2018.pdf.txt
HaoTFMDS2018.pdf.txt
Extracted text
text/plain
40034
http://repositori.upf.edu/bitstream/10230/35843/3/HaoTFMDS2018.pdf.txt
03a89a9f0c5f2332325a008915c03dcf
MD5
3
CC-LICENSE
license_rdf
license_rdf
application/rdf+xml; charset=utf-8
1232
http://repositori.upf.edu/bitstream/10230/35843/2/license_rdf
b51f25f83cca752633b6ec4c418dbcc7
MD5
2
ORIGINAL
HaoTFMDS2018.pdf
HaoTFMDS2018.pdf
application/pdf
364859
http://repositori.upf.edu/bitstream/10230/35843/1/HaoTFMDS2018.pdf
0cebdd5b496ce55750614957057a74ae
MD5
1
10230/35843
oai:repositori.upf.edu:10230/35843
2021-12-14 12:59:00.016
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/35844
2021-12-14T11:57:24Z
com_10230_26995
com_10230_16441
col_10230_26998
Costa, Michele
Marschall, Laurits
Mirsadeghi, Seyed Hamed
Sanctis, Alessandro de
2018-11-23T13:11:27Z
2018-11-23T13:11:27Z
2018
http://hdl.handle.net/10230/35844
Master's thesis for: Master's Degree in Data Science. Academic year 2017-2018
Directors: Ioannis Arapakis, Carlos Segura Perales
The main goal of our Master Project is to predict intraday stock market movements using two different kinds of input features: financial indicators and sentiment from news and tweets. While the former are part of the standard technical analysis in financial econometric models, the sentiment extracted from news articles and Twitter posts has also been shown to correlate with stock market movements. Our paper aims to contribute to existing academic and professional knowledge in two main directions. First, we evaluate three different approaches to extracting sentiment from both social and mass media, based on their forecasting power. Second, we deploy a battery of engineered features based on the sentiment, together with the financial indicators, in a machine learning model for a fine-grained, minute-level forecasting exercise. Finally, two different classes of models are fitted to test the forecasting power of the combined input features: a classical ARIMA model and, as the machine learning algorithm, an XGBoost model. We collected data on the companies Apple, JPMorgan Chase, Exxon Mobil, and Boeing.
L’objectiu principal del nostre Projecte de Màster és predir els moviments intradia del mercat de valors utilitzant dos tipus diferents de característiques d’entrada: indicadors financers i sentiments de notícies i piulades. Mentre els primers són part de l’anàlisi tècnica comú dels models economètrics financers, els sentiments extrets d’articles de notícies i piulades de twittaires també tenen una correlació demostrada amb els moviments del mercat de valors. El nostre article vol contribuir al coneixement acadèmic i professional en dues direccions principals. En primer lloc, avaluem tres aproximacions diferents per extreure els sentiments de les xarxes socials i els mitjans de masses basant-se en els seus poders de predicció. En segon lloc, despleguem una bateria de característiques d’enginyeria basades en el sentiment, juntament amb indicadors financers, en un model d’aprenentatge automàtic per a un exercici de predicció desgranat al minut. Finalment, es fan dues classes diferents de models per testejar el poder de predicció de les característiques d’entrada combinades. Hem estimat un model ARIMA clàssic i un model XGBoost com a algoritme d’aprenentatge automàtic. Hem recavat dades de les companyies Apple, JPMorgan Chase, Exxon Mobil i Boeing.
Submitted by Montserrat FERNÁNDEZ TEIXIDÓ (montserrat.fernandez@upf.edu) on 2018-11-23T13:11:27Z
No. of bitstreams: 2
Costa et alTFMDS2018.pdf: 1592414 bytes, checksum: cc041b354dc72c3d41e4b0969d7df204 (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
Made available in DSpace on 2018-11-23T13:11:27Z (GMT). No. of bitstreams: 2
Costa et alTFMDS2018.pdf: 1592414 bytes, checksum: cc041b354dc72c3d41e4b0969d7df204 (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2017-2018
Machine learning
Sentiment analysis
XGBoost
Finance
Aprenentatge automàtic
Anàlisi de sentiments
Finances
Investigation of sentiment importance on intraday stock returns
info:eu-repo/semantics/masterThesis
THUMBNAIL
Costa et alTFMDS2018.pdf.jpg
Costa et alTFMDS2018.pdf.jpg
IM Thumbnail
image/jpeg
9898
http://repositori.upf.edu/bitstream/10230/35844/4/Costa%20et%20alTFMDS2018.pdf.jpg
2314bf3e6b730f81e47ee69b81dd8400
MD5
4
TEXT
Costa et alTFMDS2018.pdf.txt
Costa et alTFMDS2018.pdf.txt
Extracted text
text/plain
53963
http://repositori.upf.edu/bitstream/10230/35844/3/Costa%20et%20alTFMDS2018.pdf.txt
9ea0371518075adb3cab744568bb485a
MD5
3
CC-LICENSE
license_rdf
license_rdf
application/rdf+xml; charset=utf-8
1232
http://repositori.upf.edu/bitstream/10230/35844/2/license_rdf
b51f25f83cca752633b6ec4c418dbcc7
MD5
2
ORIGINAL
Costa et alTFMDS2018.pdf
Costa et alTFMDS2018.pdf
application/pdf
1592414
http://repositori.upf.edu/bitstream/10230/35844/1/Costa%20et%20alTFMDS2018.pdf
cc041b354dc72c3d41e4b0969d7df204
MD5
1
10230/35844
oai:repositori.upf.edu:10230/35844
2021-12-14 12:57:24.167
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/43323
2021-12-14T11:56:27Z
com_10230_26995
com_10230_16441
col_10230_26998
Matyja, Monika
Morera, Jordi
Wolf, Sebastian
2020-01-21T15:03:54Z
2020-01-21T15:03:54Z
2019
http://hdl.handle.net/10230/43323
Master's thesis for: Master's Degree in Data Science. Academic year 2018-2019
Directors: Hrvoje Stojic, Anestis Papanikolaou
In this thesis we develop a traffic light control agent that uses reinforcement learning to manage traffic lights, with the objective of reducing traffic jams and trip time and improving other traffic metrics in a given network. To this end, we implement a Double Deep Q-Network algorithm and test its performance in controlling traffic lights on a 'small' and a 'large' traffic junction. We find that this algorithm beats a fixed traffic light phase program when traffic demand fluctuates, as it is capable of reacting to real-time traffic situations. The algorithm can be scaled up and holds promise to perform well in controlling larger transport networks as well.
En aquest treball de final de màster es desenvolupa un algorisme d'aprenentatge reforçat pel control de semàfors amb l'objectiu de reduir temps de trajecte i retencions. Específicament, s'ha implementat l'algorisme Double Deep Q-Network i s'ha comprovat la seva eficàcia comparant-lo amb escenaris realistes de control d'una intersecció simple i d'una complexa. S'ha demostrat que aquest algorisme es comporta millor que l'escenari real en el qual el canvi de fase es duu a terme amb intervals de temps fixes. Els resultats indiquen que aquesta tècnica és capaç d'adaptar-se a les situacions de trànsit canviants i per tant obtenir millor resultats que l'escenari real. L'algorisme pot ser adaptat per controlar xarxes de trànsit més grans.
Submitted by Montserrat FERNÁNDEZ TEIXIDÓ (montserrat.fernandez@upf.edu) on 2020-01-21T15:03:54Z
No. of bitstreams: 2
2019_TFM_DS_MatyjaDeep.pdf: 439369 bytes, checksum: b81a746885ccbfe414f15868ba9d308b (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
Made available in DSpace on 2020-01-21T15:03:54Z (GMT). No. of bitstreams: 2
2019_TFM_DS_MatyjaDeep.pdf: 439369 bytes, checksum: b81a746885ccbfe414f15868ba9d308b (MD5)
license_rdf: 1232 bytes, checksum: b51f25f83cca752633b6ec4c418dbcc7 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2018-2019
Aprenentatge per reforç
Circulació
Enginyeria del trànsit
Reinforcement learning
Traffic flow
Traffic engineering
Deep reinforcement learning for the optimization of traffic light control with real-time data
info:eu-repo/semantics/masterThesis
THUMBNAIL
2019_TFM_DS_MatyjaDeep.pdf.jpg
2019_TFM_DS_MatyjaDeep.pdf.jpg
IM Thumbnail
image/jpeg
8905
http://repositori.upf.edu/bitstream/10230/43323/4/2019_TFM_DS_MatyjaDeep.pdf.jpg
0fc00ec05a731fa667d6a84cde63cdd7
MD5
4
TEXT
2019_TFM_DS_MatyjaDeep.pdf.txt
2019_TFM_DS_MatyjaDeep.pdf.txt
Extracted text
text/plain
63411
http://repositori.upf.edu/bitstream/10230/43323/3/2019_TFM_DS_MatyjaDeep.pdf.txt
4c72f0449ac240e1a3d63db3612baf81
MD5
3
CC-LICENSE
license_rdf
license_rdf
application/rdf+xml; charset=utf-8
1232
http://repositori.upf.edu/bitstream/10230/43323/2/license_rdf
b51f25f83cca752633b6ec4c418dbcc7
MD5
2
ORIGINAL
2019_TFM_DS_MatyjaDeep.pdf
2019_TFM_DS_MatyjaDeep.pdf
application/pdf
439369
http://repositori.upf.edu/bitstream/10230/43323/1/2019_TFM_DS_MatyjaDeep.pdf
b81a746885ccbfe414f15868ba9d308b
MD5
1
10230/43323
oai:repositori.upf.edu:10230/43323
2021-12-14 12:56:27.712
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/46310
2021-12-14T11:55:11Z
com_10230_26995
com_10230_16441
col_10230_26998
Müller, Maximilian
2021-02-02T12:26:19Z
2021-02-02T12:26:19Z
2020-08-17
http://hdl.handle.net/10230/46310
Master's thesis for: Master's Degree in Data Science. Academic year 2019-2020
Directors: Omiros Papaspiliopoulos and Giacomo Zanella
To apply statistical learning in the framework of crossed random effects models, it is necessary to efficiently compute the Cholesky factor L of the model's precision matrix. In this paper we show that, for the case of 2 factors, what matters for this purpose is not only the sparsity of L but also the arrangement of its non-zero entries. In particular, we express the number of flops required to calculate L in terms of the number of 3-cycles in the corresponding graph. We then introduce specific designs of 2-factor crossed random effects models for which we can prove sparsity and density, respectively. We confirm our results by numerical studies with the R packages Spam and Matrix and find hints that approximations of the Cholesky factor could be an interesting approach for further decreasing the cost of computing L.
Para aplicar el aprendizaje estadístico en el marco de los modelos de efectos aleatorios cruzados es necesario calcular eficientemente el factor L de Cholesky de la matriz de precisión de los modelos. En este trabajo mostramos que para el caso de dos factores el punto crucial para este fin no es sólo la dispersión de L, sino también la disposición de las entradas no nulas. En particular, expresamos el número de FLOPS necesarios para el cálculo de L por el número de 3-ciclos en el gráfico correspondiente. A continuación, introducimos diseños específicos de modelos de efectos aleatorios cruzados de 2 factores para los que podemos probar la dispersión y la densidad, respectivamente. Confirmamos nuestros resultados mediante estudios numéricos con los paquetes Spam y Matrix de R y encontramos indicios de que las aproximaciones del factor Cholesky podrían ser un enfoque interesante para una mayor disminución del costo del cálculo de L.
Submitted by Montserrat FERNÁNDEZ TEIXIDÓ (montserrat.fernandez@upf.edu) on 2021-02-02T12:26:19Z
No. of bitstreams: 1
TFMBGSE2020DSMueller.pdf: 926730 bytes, checksum: b948e4290ba3daca6b6fb77077f6c952 (MD5)
Made available in DSpace on 2021-02-02T12:26:19Z (GMT). No. of bitstreams: 1
TFMBGSE2020DSMueller.pdf: 926730 bytes, checksum: b948e4290ba3daca6b6fb77077f6c952 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2019-2020
Àlgebra lineal
Factorització (Matemàtica)
Estadística bayesiana
Crossed random effects models
Cholesky factorization
Sparse linear algebra
Bayesian statistics
Scalable inference for crossed random effects models
info:eu-repo/semantics/masterThesis
THUMBNAIL
TFMBGSE2020DSMueller.pdf.jpg
TFMBGSE2020DSMueller.pdf.jpg
IM Thumbnail
image/jpeg
9109
http://repositori.upf.edu/bitstream/10230/46310/3/TFMBGSE2020DSMueller.pdf.jpg
882a5acb199c54757a08b5061ffa108b
MD5
3
TEXT
TFMBGSE2020DSMueller.pdf.txt
TFMBGSE2020DSMueller.pdf.txt
Extracted text
text/plain
75895
http://repositori.upf.edu/bitstream/10230/46310/2/TFMBGSE2020DSMueller.pdf.txt
ee2a425da8a5168e036321d9fd1930ab
MD5
2
ORIGINAL
TFMBGSE2020DSMueller.pdf
TFMBGSE2020DSMueller.pdf
application/pdf
926730
http://repositori.upf.edu/bitstream/10230/46310/1/TFMBGSE2020DSMueller.pdf
b948e4290ba3daca6b6fb77077f6c952
MD5
1
10230/46310
oai:repositori.upf.edu:10230/46310
2021-12-14 12:55:11.717
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/46311
2021-12-14T11:53:50Z
com_10230_26995
com_10230_16441
col_10230_26998
Pap, Aron
2021-02-02T13:04:06Z
2021-02-02T13:04:06Z
2020-06-25
http://hdl.handle.net/10230/46311
Master's thesis for: Master's Degree in Data Science. Academic year 2019-2020
Directors: Omar A. Guerrero and Joan de Martí
In this thesis project I analyse labour flow networks and company control networks in the UK. I observe that these networks exhibit characteristics that are typical of empirical networks, such as heavy-tailed degree distributions, strong communities with geo-industrial clustering and high assortativity. I document that distinguishing between the types of investors in firms can help to explain the firms' degree centrality in the company control network, and that large institutional entities with significant and exclusive control of a firm seem to be responsible for the emergence of hubs in this network. I also devise a simple network formation model to study the underlying causal processes in this company control network. I perform numerical simulations and model parameter calibration, obtaining a model that captures the empirically observed patterns in the data.
En este proyecto de tesis analizo las redes de flujo de trabajo y las redes de control de la empresa en el Reino Unido. Observo que estas redes exhiben características que son típicas de las redes empíricas, como la distribución de grados de cola gruesa, comunidades fuertes con agrupamiento geoindustrial y alta surtividad. Documento que distinguir entre el tipo de inversores de las empresas puede ayudar a comprender su grado de centralidad en la red de control de la empresa y que las grandes entidades institucionales que tienen un control significativo y exclusivo en una empresa parecen ser responsables de los centros emergentes en esta red. También ideo un modelo de formación de red simple para estudiar los procesos causales subyacentes en la red de control de esta empresa. Realizo simulaciones numéricas y calibración de parámetros del modelo, obteniendo un modelo que captura los patrones observados empíricamente en los datos.
Submitted by Montserrat FERNÁNDEZ TEIXIDÓ (montserrat.fernandez@upf.edu) on 2021-02-02T13:04:06Z
No. of bitstreams: 1
TFMBGSE2020DSPap.pdf: 7390337 bytes, checksum: d6bbc85ce69183b49ab730d1e9f95ced (MD5)
Made available in DSpace on 2021-02-02T13:04:06Z (GMT). No. of bitstreams: 1
TFMBGSE2020DSPap.pdf: 7390337 bytes, checksum: d6bbc85ce69183b49ab730d1e9f95ced (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2019-2020
Mobilitat laboral
Networks
Labour mobility
Company control
Structure and power dynamics in economic networks : a quantitative analysis of labour flow and company control networks in the UK
info:eu-repo/semantics/masterThesis
THUMBNAIL
TFMBGSE2020DSPap.pdf.jpg
TFMBGSE2020DSPap.pdf.jpg
IM Thumbnail
image/jpeg
9623
http://repositori.upf.edu/bitstream/10230/46311/3/TFMBGSE2020DSPap.pdf.jpg
1ddc70559e717faa9e93d1aa48ab7f55
MD5
3
TEXT
TFMBGSE2020DSPap.pdf.txt
TFMBGSE2020DSPap.pdf.txt
Extracted text
text/plain
112782
http://repositori.upf.edu/bitstream/10230/46311/2/TFMBGSE2020DSPap.pdf.txt
9bb1d789e59456c97667d56ca30321e4
MD5
2
ORIGINAL
TFMBGSE2020DSPap.pdf
TFMBGSE2020DSPap.pdf
application/pdf
7390337
http://repositori.upf.edu/bitstream/10230/46311/1/TFMBGSE2020DSPap.pdf
d6bbc85ce69183b49ab730d1e9f95ced
MD5
1
10230/46311
oai:repositori.upf.edu:10230/46311
2021-12-14 12:53:50.998
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/46314
2021-12-14T11:53:07Z
com_10230_26995
com_10230_16441
col_10230_26998
Battaglia, Laura
Salunina, Maria
2021-02-02T14:22:50Z
2021-02-02T14:22:50Z
2020-07-20
http://hdl.handle.net/10230/46314
Master's thesis for: Master's Degree in Data Science. Academic year 2019-2020
Director: Omiros Papaspiliopoulos
In this study, we propose an approach for extracting a low-dimensional signal from a collection of text documents ordered over time. The proposed framework applies Latent Dirichlet Allocation (LDA) to obtain a meaningful representation of documents as mixtures over a set of topics. Such representations can then be modeled via a Dynamic Linear Model (DLM) as noisy realisations of a limited number of latent factors that evolve over time. We apply this approach to Federal Open Market Committee (FOMC) speech transcripts for the period of the Greenspan presidency. We are able to extract a latent factor that fairly closely resembles the Economic Policy Uncertainty Index for the United States.
En este trabajo proponemos una metodología para la extracción de señales de baja dimensionalidad en una colección de textos ordenados temporalmente. El enfoque propuesto prevé la aplicación de la asignación latente de Dirichlet - Latent Dirichlet Allocation (LDA) - para obtener una representación de los documentos como una mezcla de diversos temas. Dichas representaciones se pueden modelar a través de un modelo lineal dinámico - Dynamic Linear Model (DLM) - como realizaciones ruidosas de un número limitado de factores latentes que evolucionan en el tiempo. Aplicamos este enfoque a las transcripciones de los pronunciamientos del Comité Federal de Mercado Abierto para el período de la presidencia de Alan Greenspan. Utilizando este modelo podemos extraer un factor latente que se asemeja al índice de incertidumbre de política económica de Estados Unidos.
Submitted by Montserrat FERNÁNDEZ TEIXIDÓ (montserrat.fernandez@upf.edu) on 2021-02-02T14:22:50Z
No. of bitstreams: 1
TFMBGSE2020DSBattagliaSalunina.pdf: 2915341 bytes, checksum: 62a035a78cc4621e3207f452e0434beb (MD5)
Made available in DSpace on 2021-02-02T14:22:50Z (GMT). No. of bitstreams: 1
TFMBGSE2020DSBattagliaSalunina.pdf: 2915341 bytes, checksum: 62a035a78cc4621e3207f452e0434beb (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Treball de fi de màster – Curs 2019-2020
Signal extraction
Topic model
Dynamic linear model
Federal Open Market Committee (FOMC)
Extracció (Lingüística)
Tracking the economy using FOMC speech transcripts
info:eu-repo/semantics/masterThesis
THUMBNAIL
TFMBGSE2020DSBattagliaSalunina.pdf.jpg
TFMBGSE2020DSBattagliaSalunina.pdf.jpg
IM Thumbnail
image/jpeg
8550
http://repositori.upf.edu/bitstream/10230/46314/3/TFMBGSE2020DSBattagliaSalunina.pdf.jpg
38d80e6275ec2cbccd64b0ff94541329
MD5
3
TEXT
TFMBGSE2020DSBattagliaSalunina.pdf.txt
TFMBGSE2020DSBattagliaSalunina.pdf.txt
Extracted text
text/plain
53729
http://repositori.upf.edu/bitstream/10230/46314/2/TFMBGSE2020DSBattagliaSalunina.pdf.txt
0442ca02abcbf3419e8deac4ca19a484
MD5
2
ORIGINAL
TFMBGSE2020DSBattagliaSalunina.pdf
TFMBGSE2020DSBattagliaSalunina.pdf
application/pdf
2915341
http://repositori.upf.edu/bitstream/10230/46314/1/TFMBGSE2020DSBattagliaSalunina.pdf
62a035a78cc4621e3207f452e0434beb
MD5
1
10230/46314
oai:repositori.upf.edu:10230/46314
2021-12-14 12:53:07.686
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/492032021-12-15T02:31:51Zcom_10230_26995com_10230_16441col_10230_26998
Gimenez Funes, Eduard
2021-12-14T12:36:33Z
2021-12-14T12:36:33Z
2021
http://hdl.handle.net/10230/49203
Treball fi de màster de: Master's Degree in Data Science. Curs 2020-2021
Directors: Vicenç Gómez (UPF) , Carlos Segura and Ferran Diego (Telefónica Research)
Normalizing flows are an elegant approach to generative modelling. It can be shown that learning a probability distribution of a continuous variable X is equivalent to learning a mapping f from the domain where X is defined to R^n such that the resulting distribution is Gaussian. In “Glow: Generative flow with invertible 1x1 convolutions”, Kingma et al. introduced the Glow model. Normalizing flows arrange the latent space in such a way that feature additivity is possible, allowing synthetic image generation. For example, it is possible to take the image of a person not smiling, add a smile, and obtain the image of the same person smiling. Using the CelebA dataset, we report new experimental properties of the latent space, such as specular images and linear discrimination. Finally, we propose a mathematical framework that helps to understand why feature additivity works.
Normalizing flows es una elegante aproximación al modelado generativo. Se puede demostrar que aprender una distribución de probabilidad de una variable continua X es equivalente a aprender un mapeo f del dominio donde X se define a Rn de forma que la densidad resultante sea una Gaussiana. En " Glow: Generative flow with invertible 1x1 convolutions", Kingma et al introdujeron el modelo Glow. Los flujos de normalización organizan el espacio latente de tal manera que es posible la adición de características, lo que permite la generación de imágenes sintéticas. Por ejemplo, es posible tomar la imagen de una persona que no sonríe, agregar una sonrisa y obtener la imagen de la misma persona sonriendo. Utilizando el conjunto de datos de CelebA encontramos nuevas propiedades experimentales del espacio latente, como imágenes especulares y discriminación lineal. Finalmente, proponemos un modelo matemático que ayuda a comprender por qué funciona la aditividad de características.
Submitted by MONTSERRAT FERNANDEZ TEIXIDO (montserrat.fernandez@upf.edu) on 2021-12-14T12:36:33Z
No. of bitstreams: 1
TFM2021BGSEGimenezUnder.pdf: 17144753 bytes, checksum: 1744ebccec2d914ceed273358d317e14 (MD5)
Made available in DSpace on 2021-12-14T12:36:33Z (GMT). No. of bitstreams: 1
TFM2021BGSEGimenezUnder.pdf: 17144753 bytes, checksum: 1744ebccec2d914ceed273358d317e14 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Treball de fi de màster – Curs 2020-2021
Generative models
Neural networks
Deep learning
Latent space
Modelos generativos
Redes neuronales
Aprendizaje profundo
Espacio latente
Understanding latent vector arithmetic for attribute manipulation in normalizing flows
info:eu-repo/semantics/masterThesis
THUMBNAIL
TFM2021BGSEGimenezUnder.pdf.jpg
TFM2021BGSEGimenezUnder.pdf.jpg
IM Thumbnail
image/jpeg
9430
http://repositori.upf.edu/bitstream/10230/49203/3/TFM2021BGSEGimenezUnder.pdf.jpg
034901d6f43069d2ce147e1783d1a1fc
MD5
3
TEXT
TFM2021BGSEGimenezUnder.pdf.txt
TFM2021BGSEGimenezUnder.pdf.txt
Extracted text
text/plain
47203
http://repositori.upf.edu/bitstream/10230/49203/2/TFM2021BGSEGimenezUnder.pdf.txt
507a3e47263408b8d65b7515899de384
MD5
2
ORIGINAL
TFM2021BGSEGimenezUnder.pdf
TFM2021BGSEGimenezUnder.pdf
application/pdf
17144753
http://repositori.upf.edu/bitstream/10230/49203/1/TFM2021BGSEGimenezUnder.pdf
1744ebccec2d914ceed273358d317e14
MD5
1
10230/49203
oai:repositori.upf.edu:10230/49203
2021-12-15 03:31:51.299
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/492042021-12-16T02:31:42Zcom_10230_26995com_10230_16441col_10230_26998
Agustí, Marc
Altmeyer, Patrick
Vidal-Quadras, Ignacio
2021-12-14T12:48:52Z
2021-12-14T12:48:52Z
2021-07-20
http://hdl.handle.net/10230/49204
Treball fi de màster de: Master's Degree in Data Science. Curs 2020-2021
Director: Christian Brownlees
Vector autoregression (VAR) models are a popular choice for forecasting macroeconomic time series data. Due to their simplicity and success at modelling monetary economic indicators, VARs have become a standard tool for central bankers to construct economic forecasts. In light of recent advancements in computational power and the development of advanced machine learning and deep learning algorithms, we propose a simple way to integrate these tools into the VAR framework. This paper aims to contribute to the time series literature by introducing a ground-breaking methodology which we refer to as Deep Vector Autoregression (Deep VAR). By fitting each equation of the VAR system with a deep neural network, the Deep VAR outperforms the VAR in terms of in-sample fit, out-of-sample fit and point forecasting accuracy. In particular, we find that the Deep VAR is better able to capture structural economic changes during periods of uncertainty and recession.
Los modelos de auto-regresión vectorial (VAR) son una opción popular para pronosticar datos de series de tiempo macroeconómicas. Debido a su simplicidad y éxito al modelar los indicadores económicos monetarios, los VAR se han convertido en una herramienta estándar para que los bancos centrales construyan pronósticos económicos. En base a los avances recientes en el poder computacional y el desarrollo de algoritmos avanzados de Machine Learning y Deep Learning, proponemos una forma sencilla de integrar estas herramientas en el marco relativo al VAR. Este paper tiene como objetivo contribuir a la literatura de series de tiempo mediante la introducción de una metodología innovadora a la que nos referimos como Deep VAR. Al ajustar cada ecuación del sistema VAR con una red neuronal profunda, Deep VAR supera al VAR en términos de ajuste dentro de la muestra, ajuste fuera de la muestra y precisión de pronóstico de puntos. En particular, encontramos que el Deep VAR puede capturar mejor los cambios económicos estructurales durante períodos de incertidumbre y recesión.
Submitted by MONTSERRAT FERNANDEZ TEIXIDO (montserrat.fernandez@upf.edu) on 2021-12-14T12:48:52Z
No. of bitstreams: 1
TFM2021BGSEAgustiDeep.pdf: 6142712 bytes, checksum: dbfd68ae087473a4225f9cbe5d45b459 (MD5)
Made available in DSpace on 2021-12-14T12:48:52Z (GMT). No. of bitstreams: 1
TFM2021BGSEAgustiDeep.pdf: 6142712 bytes, checksum: dbfd68ae087473a4225f9cbe5d45b459 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Treball de fi de màster – Curs 2020-2021
Deep vector autoregression
Macroeconomic forecasting
Neural networks
Autoregresores vectoriales profundos
Predicción macroeconómica
Redes neuronales
Deep vector autoregression for macroeconomic data
info:eu-repo/semantics/masterThesis
THUMBNAIL
TFM2021BGSEAgustiDeep.pdf.jpg
TFM2021BGSEAgustiDeep.pdf.jpg
IM Thumbnail
image/jpeg
8950
http://repositori.upf.edu/bitstream/10230/49204/2/TFM2021BGSEAgustiDeep.pdf.jpg
695d5ef3e5c8d3ccf6a9a9b63e30fe40
MD5
2
TEXT
TFM2021BGSEAgustiDeep.pdf.txt
TFM2021BGSEAgustiDeep.pdf.txt
Extracted text
text/plain
66471
http://repositori.upf.edu/bitstream/10230/49204/3/TFM2021BGSEAgustiDeep.pdf.txt
03a67eb61f3f6ea46e17d29237be65e7
MD5
3
ORIGINAL
TFM2021BGSEAgustiDeep.pdf
TFM2021BGSEAgustiDeep.pdf
application/pdf
6142712
http://repositori.upf.edu/bitstream/10230/49204/1/TFM2021BGSEAgustiDeep.pdf
dbfd68ae087473a4225f9cbe5d45b459
MD5
1
10230/49204
oai:repositori.upf.edu:10230/49204
2021-12-16 03:31:42.582
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/565692023-04-27T01:30:21Zcom_10230_26995com_10230_16441col_10230_26998
Ampudia, David
Leung, Clinton
2023-04-26T11:05:26Z
2023-04-26T11:05:26Z
2022-05
http://hdl.handle.net/10230/56569
Treball fi de màster de: Master's Degree in Data Science. Methodology Program. Curs 2021-2022
Tutor: Hrvoje Stojic
Bayesian optimization has emerged as an effective and efficient approach for finding the global optimum of highly complex derivative-free black-box functions. It typically models the objective function with Gaussian processes (GPs) as a surrogate. Based on this surrogate, an auxiliary acquisition function proposes candidate optimum locations at which to query the objective function. In this paper, we explore recent developments that may help alleviate two key limitations of GPs: poor performance with large datasets, and non-stationary target functions. To this end, we propose and implement several scalable uncertainty aware neural networks as alternative surrogates. In a series of tests, we showcase the relative performance of ensembles, Bayesian, and direct estimation neural network approaches against that of traditional GPs and state-of-the-art Sparse Variational Gaussian Processes (SVGPs) in Bayesian optimization settings. Our results show that not only are neural networks a scalable solution with comparable performance to GPs, but they also hold the potential to outperform SVGPs.
La optimització bayesiana ha aparegut com una alternativa eficaç i eficient per trobar l’òptim global de funcions sense derivades i molt complexes. Normalment, la optimització bayesiana modela la funció objectiu amb processos gaussians (PG) com a substitut. En funció d’aquest substitut, una funció d’adquisició auxiliar proposa possibles ubicacions òptimes per consultar la funció objectiu. En aquest article, explorem els desenvolupaments recents que poden ajudar a al·leviar dues limitacions clau dels PG: rendiment baix quan disposem de grans quantitats de dades, i funcions objectiu no estacionàries. Amb aquest fi, proposem i implementem una sèrie d’alternatives en forma de xarxes neuronals amb capacitat de quantificar l’incertesa del model i que permeten treballar amb dades de major dimensionalitat. En una sèrie de proves i escenaris d’optimització bayesiana, mostrem el rendiment de models neurals de conjunts (ensembles), bayesians i d’estimació directa de l’incertesa i els comparem amb els PG tradicionals i amb els processos gaussians variacionals dispersos (PGVD) de darrera generació. Els nostres resultats mostren que les xarxes neuronals implementades no només són una solució escalable amb un rendiment comparable als PG, sinó que també tenen el potencial de superar als PGVD.
La optimización bayesiana ha surgido como una alternativa eficaz y eficiente para encontrar el óptimo global de funciones sin derivadas y altamente complejas. Por lo general, la optimización bayesiana modela la función objetivo con procesos gaussianos (PG) como sustituto. En función de este sustituto, una función de adquisición auxiliar propone posibles ubicaciones óptimas para consultar la función objetivo. En este artículo, exploramos los desarrollos recientes que pueden ayudar a aliviar dos limitaciones clave de los PG: bajo rendimiento cuando contamos con grandes cantidades de datos, y funciones objetivo no estacionarias. Con este fin, proponemos e implementamos una serie de alternatives en forma de redes neuronales capaces de cuantificar la incertidumbre del modelo y que permiten trabajar con datos de mayor dimensionalidad. En una serie de pruebas y escenarios de optimización bayesiana, mostramos el rendimiento de modelos neurales de conjuntos (ensembles), bayesianos y de estimación directa de la incertidumbre y los comparamos a los PG tradicionales y a los procesos gaussianos variacionales dispersos (PGVD) de última generación. Nuestros resultados muestran que las redes neuronales implementadas no solo son una solución escalable con un rendimiento comparable al de los PG, sino que también tienen el potencial de superar a los PGVD.
Submitted by MONTSERRAT FERNANDEZ TEIXIDO (montserrat.fernandez@upf.edu) on 2023-04-26T11:05:26Z
No. of bitstreams: 1
TFM22Ampudia_LeungBSE_DS.pdf: 4669962 bytes, checksum: 2e5f9f808f2626bd1f9e5516cb59ce02 (MD5)
Made available in DSpace on 2023-04-26T11:05:26Z (GMT). No. of bitstreams: 1
TFM22Ampudia_LeungBSE_DS.pdf: 4669962 bytes, checksum: 2e5f9f808f2626bd1f9e5516cb59ce02 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Treball de fi de màster – Curs 2021-2022
Bayesian optimization
Gaussian processes
Surrogate models
Uncertainty aware neural networks
Ensembles
Direct estimation neural networks
Sparse Variational Gaussian Processes (SVGP)
Optimización bayesiana
Procesos gaussianos
Modelos sustitutos
Redes neuronales capaces de cuantificar la incertidumbre
Conjuntos
Estimación directa de la incertidumbre
Procesos gaussianos variacionales dispersos (PGVD)
Optimització bayesiana
Processos gaussians
Models substituts
Xarxes neuronals amb capacitat de quantificar l’incertesa
Conjunts
Estimació directa de l’incertesa
Processos gaussians variacionals dispersos (PGVD)
Bayesian optimization with uncertainty aware neural networks
info:eu-repo/semantics/masterThesis
THUMBNAIL
TFM22Ampudia_LeungBSE_DS.pdf.jpg
TFM22Ampudia_LeungBSE_DS.pdf.jpg
IM Thumbnail
image/jpeg
11242
http://repositori.upf.edu/bitstream/10230/56569/3/TFM22Ampudia_LeungBSE_DS.pdf.jpg
8102f716f97a2f7ac742a9bb32a56f42
MD5
3
TEXT
TFM22Ampudia_LeungBSE_DS.pdf.txt
TFM22Ampudia_LeungBSE_DS.pdf.txt
Extracted text
text/plain
66107
http://repositori.upf.edu/bitstream/10230/56569/2/TFM22Ampudia_LeungBSE_DS.pdf.txt
09360c03cb439224c3ebd10caf3c4ec8
MD5
2
ORIGINAL
TFM22Ampudia_LeungBSE_DS.pdf
TFM22Ampudia_LeungBSE_DS.pdf
application/pdf
4669962
http://repositori.upf.edu/bitstream/10230/56569/1/TFM22Ampudia_LeungBSE_DS.pdf
2e5f9f808f2626bd1f9e5516cb59ce02
MD5
1
10230/56569
oai:repositori.upf.edu:10230/56569
2023-04-27 03:30:21.702
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/565702023-04-27T01:30:19Zcom_10230_26995com_10230_16441col_10230_26998
Aguilar, Iván
Jones, Rebecca
Lovicu, Gian-Piero
2023-04-26T11:40:41Z
2023-04-26T11:40:41Z
2022-05
http://hdl.handle.net/10230/56570
Treball fi de màster de: Master's Degree in Data Science. Methodology Program. Curs 2021-2022
Tutor: Christian Brownlees
Financial institutions are beginning to integrate cryptocurrencies into their payment systems but must ensure that they comply with anti-money laundering regulations to avoid facilitating transactions linked to criminal activities. We propose a cryptocurrency risk detection model that could be used by these institutions. It is novel in two ways: firstly, it prioritises high recall, and secondly, it organises the transaction data in a different 'address-level' manner. We test different Graph Neural Network (GNN) models and find that the Graph Attention Network using our address-level data achieves a recall of 83%, an improvement on results achieved in previous literature.
Les entitats financeres comencen a integrar criptomonedes als seus sistemes de pagament, però han de garantir el compliment de la normativa contra el blanqueig de capitals per evitar facilitar les transaccions vinculades a activitats delictives. Proposem un model de detecció de risc de criptomoneda que podrien utilitzar aquestes institucions. És nou de dues maneres: en primer lloc, prioritza un record elevat i també organitza les dades de la transacció d'una manera diferent de "nivell d'adreça". Provem diferents models de xarxa neuronal de gràfics (GNN) i trobem que la xarxa d'atenció gràfica utilitzant les nostres dades a nivell d'adreça aconsegueix un record del 83%, una millora dels resultats assolits en la literatura anterior.
Las instituciones financieras están comenzando a integrar las criptomonedas en sus sistemas de pago, pero deben asegurarse de cumplir con las normas contra el lavado de dinero para evitar facilitar transacciones vinculadas a actividades delictivas. Proponemos un modelo de detección de riesgo en criptomonedas que podría ser utilizado por estas instituciones. Es novedoso de dos maneras: en primer lugar, prioriza un recall alto y también organiza los datos de la transacción en un formato a nivel ‘address’. Probamos diferentes modelos de Graph Neural Network (GNN) y encontramos que Graph Attention Network usando nuestros datos de nivel address logra un recall del 83%, una mejora en los resultados logrados en la literatura anterior.
Submitted by MONTSERRAT FERNANDEZ TEIXIDO (montserrat.fernandez@upf.edu) on 2023-04-26T11:40:41Z
No. of bitstreams: 1
TFM22Aguilar_Jones_Lovicu DSM.pdf: 1956053 bytes, checksum: 78970743b8d58927fe743b34780606d2 (MD5)
Made available in DSpace on 2023-04-26T11:40:41Z (GMT). No. of bitstreams: 1
TFM22Aguilar_Jones_Lovicu DSM.pdf: 1956053 bytes, checksum: 78970743b8d58927fe743b34780606d2 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Treball de fi de màster – Curs 2021-2022
GNN
GAT
Cryptocurrency
Criptomoneda
Risk detection in cryptocurrency markets: meeting the needs of traditional finance
info:eu-repo/semantics/masterThesis
THUMBNAIL
TFM22Aguilar_Jones_Lovicu DSM.pdf.jpg
TFM22Aguilar_Jones_Lovicu DSM.pdf.jpg
IM Thumbnail
image/jpeg
11766
http://repositori.upf.edu/bitstream/10230/56570/3/TFM22Aguilar_Jones_Lovicu%20DSM.pdf.jpg
20b1b8d712a00ed58fd88edf17d2b53c
MD5
3
TEXT
TFM22Aguilar_Jones_Lovicu DSM.pdf.txt
TFM22Aguilar_Jones_Lovicu DSM.pdf.txt
Extracted text
text/plain
91181
http://repositori.upf.edu/bitstream/10230/56570/2/TFM22Aguilar_Jones_Lovicu%20DSM.pdf.txt
ac9d95473b81c49a97023ceacc31055e
MD5
2
ORIGINAL
TFM22Aguilar_Jones_Lovicu DSM.pdf
TFM22Aguilar_Jones_Lovicu DSM.pdf
application/pdf
1956053
http://repositori.upf.edu/bitstream/10230/56570/1/TFM22Aguilar_Jones_Lovicu%20DSM.pdf
78970743b8d58927fe743b34780606d2
MD5
1
10230/56570
oai:repositori.upf.edu:10230/56570
2023-04-27 03:30:19.799
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/565722023-04-27T01:30:13Zcom_10230_26995com_10230_16441col_10230_26998
Couble, Andrés
Schindler, Mathias
Stassinos, Kalliope
2023-04-26T15:51:10Z
2023-04-26T15:51:10Z
2022-06
http://hdl.handle.net/10230/56572
Treball fi de màster de: Master's Degree in Data Science: Master Program in Data Science for Decision Making. Curs 2021-2022
Tutors: Jesús Cerquides; Hannes Mueller
This thesis presents a general-purpose corpus construction methodology with Twitter data for a given political topic in a given country. It applies the methodology to immigration in Chile from November 2021 to April 2022, resulting in a corpus with 573,999 tweets. Our results indicate increasing anti-immigration views among Chilean Twitter users. Right-leaning users are more active and more anti-immigration. Left-leaning users are mostly concerned with xenophobia and racism. Utilizing network analysis methods, we find that right-leaning users are also more influential and interconnected. The results are consistent with previous studies and the methodology is robust to other political topics such as feminism.
Esta tesis presenta una metodología de construcción de corpus con datos de Twitter para un tema político dado en un país dado. Aplicamos la metodología al tema de inmigración en Chile desde noviembre de 2021 hasta abril de 2022, resultando en un corpus con 573.999 tuits. Nuestros resultados indican un aumento de las opiniones contra la inmigración de los usuarios chilenos de Twitter. Los usuarios de derecha son más activos y antinmigración. Los usuarios de tendencia izquierdista se preocupan principalmente por la xenofobia y el racismo. Utilizando métodos de análisis de red, encontramos que los usuarios de derecha también son más influyentes e interconectados. Los resultados son consistentes con estudios previos y la metodología fue testeada para otros temas políticos de interés como el feminismo.
Submitted by MONTSERRAT FERNANDEZ TEIXIDO (montserrat.fernandez@upf.edu) on 2023-04-26T15:51:10Z
No. of bitstreams: 1
TFM22Couble_Schindler_Stassinos_BSE_DS.pdf: 10008388 bytes, checksum: 841f3877b27e5a51eb79a715f1137855 (MD5)
Made available in DSpace on 2023-04-26T15:51:10Z (GMT). No. of bitstreams: 1
TFM22Couble_Schindler_Stassinos_BSE_DS.pdf: 10008388 bytes, checksum: 841f3877b27e5a51eb79a715f1137855 (MD5)
application/pdf
eng
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Treball de fi de màster – Curs 2021-2022
Twitter
Immigration
Corpus construction
Inmigración
Construcción de corpus
Corpus construction and social media analysis about immigration in Chile
info:eu-repo/semantics/masterThesis
THUMBNAIL
TFM22Couble_Schindler_Stassinos_BSE_DS.pdf.jpg
TFM22Couble_Schindler_Stassinos_BSE_DS.pdf.jpg
IM Thumbnail
image/jpeg
12386
http://repositori.upf.edu/bitstream/10230/56572/3/TFM22Couble_Schindler_Stassinos_BSE_DS.pdf.jpg
964f4d393c8853d1a7f12410ac7f5bab
MD5
3
TEXT
TFM22Couble_Schindler_Stassinos_BSE_DS.pdf.txt
TFM22Couble_Schindler_Stassinos_BSE_DS.pdf.txt
Extracted text
text/plain
152888
http://repositori.upf.edu/bitstream/10230/56572/2/TFM22Couble_Schindler_Stassinos_BSE_DS.pdf.txt
ad51fc661de2064f55fae5fe48bc7756
MD5
2
ORIGINAL
TFM22Couble_Schindler_Stassinos_BSE_DS.pdf
TFM22Couble_Schindler_Stassinos_BSE_DS.pdf
application/pdf
10008388
http://repositori.upf.edu/bitstream/10230/56572/1/TFM22Couble_Schindler_Stassinos_BSE_DS.pdf
841f3877b27e5a51eb79a715f1137855
MD5
1
10230/56572
oai:repositori.upf.edu:10230/56572
2023-04-27 03:30:13.691
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/583562023-11-23T10:25:39Zcom_10230_26995com_10230_16441col_10230_26998
Conner Bonmatí, Miguel
Talvi Robledo, Ramón
Wielath, Dominik Johannes
2023-11-22T15:40:33Z
2023-11-22T15:40:33Z
2023-06
http://hdl.handle.net/10230/58356
Treball fi de màster de: Master's Degree in Data Science: Data Science Methodology Program. Curs 2022-2023
Tutor: Hannes Mueller
We attempt to build a road quality classifier to detect bad roads using satellite imagery in the province of Sud-Kivu in the Democratic Republic of the Congo (DRC). Using 60 cm/pixel resolution imagery from Google Earth, paired with 100 m IRI road quality data for Liberia, we train a CNN (EfficientNetV2) that performs with an accuracy of 47% for 5 classes and 80% for 2 classes (AUC: 0.75). We then establish a connection between the model trained in Liberia and road quality in the DRC. We find that our methods seem to work well given the many limitations of the project.
Construimos un clasificador para evaluar la calidad vial mediante imágenes satelitales en la provincia de Sud-Kivu en la República Democrática del Congo (RDC). Empleando imágenes de Google Earth con resolución de 60 cm/pixel y datos de calidad vial IRI cada 100 m para Liberia, entrenamos una CNN (EfficientNetV2) que alcanza una precisión del 47% en clasificaciones de cinco clases y del 80% en dos clases (AUC: 0.75). Establecimos una conexión entre el modelo entrenado en Liberia y la calidad de carreteras en la RDC. Encontramos que nuestros métodos exhiben una eficacia prometedora a pesar de las limitaciones del proyecto.
Submitted by MARIA PORT PUIG (maria.port@upf.edu) on 2023-11-22T15:40:33Z
No. of bitstreams: 1
BSETFM23_ConTalvWie.pdf: 15546576 bytes, checksum: 7d487b72386486e144c05c1c92714c90 (MD5)
Made available in DSpace on 2023-11-22T15:40:33Z (GMT). No. of bitstreams: 1
BSETFM23_ConTalvWie.pdf: 15546576 bytes, checksum: 7d487b72386486e144c05c1c92714c90 (MD5)
application/pdf
eng
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
https://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Treball de fi de màster – Curs 2022-2023
Satellite imagery
Computer vision
Road quality
Imágenes satelitales
Visión de computador
Calidad vial
Leveraging satellite imagery to assess road quality in the Democratic Republic of the Congo
info:eu-repo/semantics/masterThesis
THUMBNAIL
BSETFM23_ConTalvWie.pdf.jpg
BSETFM23_ConTalvWie.pdf.jpg
IM Thumbnail
image/jpeg
15336
http://repositori.upf.edu/bitstream/10230/58356/3/BSETFM23_ConTalvWie.pdf.jpg
c0aca65be3fb68e5e0aa522da2fa2b2d
MD5
3
TEXT
BSETFM23_ConTalvWie.pdf.txt
BSETFM23_ConTalvWie.pdf.txt
Extracted text
text/plain
58090
http://repositori.upf.edu/bitstream/10230/58356/2/BSETFM23_ConTalvWie.pdf.txt
90beffad647da62c3b90687bff263b98
MD5
2
ORIGINAL
BSETFM23_ConTalvWie.pdf
BSETFM23_ConTalvWie.pdf
application/pdf
15546576
http://repositori.upf.edu/bitstream/10230/58356/1/BSETFM23_ConTalvWie.pdf
7d487b72386486e144c05c1c92714c90
MD5
1
10230/58356
oai:repositori.upf.edu:10230/58356
2023-11-23 11:25:39.599
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/583582023-11-23T13:05:22Zcom_10230_26995com_10230_16441col_10230_26998
Odizzio, Catalina
Pissinis, Agostina
2023-11-22T16:45:58Z
2023-11-22T16:45:58Z
2023-07
http://hdl.handle.net/10230/58358
Treball fi de màster de: Master's Degree in Data Science: Data Science for Decision Making Program. Curs 2022-2023
Tutors: Hannes Mueller i Jesús Cerquides
This study delves into understanding and predicting user engagement in Enhance VR, a virtual reality cognitive training application, through data-driven approaches. The dataset encompasses de-identified user data including demographic characteristics, mood, and session-related variables. Initial data exploration involves descriptive statistics, data visualization, and inferential statistics, assessing correlations between attributes and their effects on engagement and performance. Machine learning models including Random Forests and Gradient Boosting are developed to predict user engagement levels. K-Prototypes clustering is employed for segmentation, identifying distinct user groups based on behavioral and demographic attributes. This research informs the strategic design and content delivery of Enhance VR by identifying distinct user groups and predicting engagement patterns.
Este estudio profundiza en la comprensión y predicción del compromiso del usuario en Enhance VR, una aplicación de entrenamiento cognitivo de realidad virtual, a través de un enfoque basado en datos. El conjunto de datos abarca usuarios no identificados, incluyendo características demográficas, de ánimo y relacionadas con sesiones. La exploración inicial de datos comprende estadísticas descriptivas, visualizaciones y estadísticas inferenciales, evaluando correlaciones entre atributos y sus efectos en el compromiso y rendimiento. Se desarrollan modelos de aprendizaje automático, Random Forest y Gradient Boosting entre otros, para predecir el nivel de compromiso del usuario. Empleamos K-Prototypes para la segmentación, identificando grupos distintos de usuarios basados en atributos conductuales y demográficos. Esta investigación informa el diseño estratégico y la entrega de contenido de Enhance VR al identificar distintos grupos de usuarios y predecir patrones de compromiso.
Submitted by MARIA PORT PUIG (maria.port@upf.edu) on 2023-11-22T16:45:58Z
No. of bitstreams: 1
BSETFM23_OdiPiss.pdf: 1192363 bytes, checksum: decae1714df7e5372780a0a5f6282846 (MD5)
Made available in DSpace on 2023-11-22T16:45:58Z (GMT). No. of bitstreams: 1
BSETFM23_OdiPiss.pdf: 1192363 bytes, checksum: decae1714df7e5372780a0a5f6282846 (MD5)
application/pdf
eng
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
https://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Treball de fi de màster – Curs 2022-2023
Predictive modeling
Machine learning
User segmentation
Modelado predictivo
Aprendizaje automático
Segmentación de usuarios
Exploring user retention in enhance VR: a comprehensive analysis using predictive models and clustering
info:eu-repo/semantics/masterThesis
THUMBNAIL
BSETFM23_OdiPiss.pdf.jpg
BSETFM23_OdiPiss.pdf.jpg
IM Thumbnail
image/jpeg
13264
http://repositori.upf.edu/bitstream/10230/58358/3/BSETFM23_OdiPiss.pdf.jpg
dcb9833906ff654ccfd8de50593b5648
MD5
3
TEXT
BSETFM23_OdiPiss.pdf.txt
BSETFM23_OdiPiss.pdf.txt
Extracted text
text/plain
87213
http://repositori.upf.edu/bitstream/10230/58358/2/BSETFM23_OdiPiss.pdf.txt
b25b689cc5bfbe2189903cc58c2acbb1
MD5
2
ORIGINAL
BSETFM23_OdiPiss.pdf
BSETFM23_OdiPiss.pdf
application/pdf
1192363
http://repositori.upf.edu/bitstream/10230/58358/1/BSETFM23_OdiPiss.pdf
decae1714df7e5372780a0a5f6282846
MD5
1
10230/58358
oai:repositori.upf.edu:10230/58358
2023-11-23 14:05:22.93
Repositori digital de la UPF
repositori@upf.edu
oai:repositori.upf.edu:10230/583592023-11-23T13:05:50Zcom_10230_26995com_10230_16441col_10230_26998
De los Santos, Daniela
Frey, Eric
Vassallo, Renato
2023-11-22T17:09:29Z
2023-11-22T17:09:29Z
2023-08-18
http://hdl.handle.net/10230/58359
Treball fi de màster de: Master's Degree in Data Science: Data Science for Decision Making Program. Curs 2022-2023
Tutors: Hannes Mueller i Christian Brownlees
This study presents a novel forecasting framework for global refugee flows, incorporating non-conventional data sources such as Google Trends, the GDELT project event dataset, and conflict forecasts, among others. Our main objective is to generate accurate predictions for the number of new refugee arrivals per country pair, in order to help facilitate effective humanitarian response. We develop a comprehensive global model which predicts refugee outflows and country-pair flows separately. Our results reveal a significant improvement in prediction accuracy by augmenting traditional variables with non-conventional data, with Random Forest and Gradient Boosting as effective regressors.
Este estudio introduce un novedoso marco de forecasting para flujos globales de personas refugiadas, incorporando fuentes de datos no convencionales como Google Trends, el proyecto GDELT, entre otros. Nuestro objetivo principal es generar predicciones precisas para la cantidad de nuevos arribos de refugiados por pares de países, con el fin de facilitar una respuesta humanitaria efectiva. Desarrollamos un modelo global integral que predice flujos de emigrantes y flujos entre pares de países de manera individual. Nuestros resultados muestran una mejora significativa en la precisión de las predicciones al ampliar variables tradicionales con datos no convencionales, utilizando Random Forest y Gradient Boosting como regresores efectivos.
Submitted by MARIA PORT PUIG (maria.port@upf.edu) on 2023-11-22T17:09:29Z
No. of bitstreams: 1
BSETFM23_DelosFreyVass.pdf: 1679821 bytes, checksum: e10893be1bf0e173120f11a012a8939e (MD5)
Made available in DSpace on 2023-11-22T17:09:29Z (GMT). No. of bitstreams: 1
BSETFM23_DelosFreyVass.pdf: 1679821 bytes, checksum: e10893be1bf0e173120f11a012a8939e (MD5)
application/pdf
eng
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
https://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2022-2023
Forecasting
Refugee flows
Google trends
Predicción
Refugiados
Forecasting global refugee flows: a machine learning approach using non-conventional data
info:eu-repo/semantics/masterThesis
THUMBNAIL
BSETFM23_DelosFreyVass.pdf.jpg
IM Thumbnail
image/jpeg
12405
http://repositori.upf.edu/bitstream/10230/58359/3/BSETFM23_DelosFreyVass.pdf.jpg
0770340c021a32db1359d81d77a3cb24
MD5
3
TEXT
BSETFM23_DelosFreyVass.pdf.txt
Extracted text
text/plain
51362
http://repositori.upf.edu/bitstream/10230/58359/2/BSETFM23_DelosFreyVass.pdf.txt
6ca7eb0180c2aaf5d08c0c03ca186a9c
MD5
2
ORIGINAL
BSETFM23_DelosFreyVass.pdf
application/pdf
1679821
http://repositori.upf.edu/bitstream/10230/58359/1/BSETFM23_DelosFreyVass.pdf
e10893be1bf0e173120f11a012a8939e
MD5
1
10230/58359
oai:repositori.upf.edu:10230/58359
2023-11-23 14:05:50.913
oai:repositori.upf.edu:10230/58367
2023-12-07T08:24:27Z
com_10230_26995
com_10230_16441
col_10230_26998
Chaves, Giovanna
Philipp, Margherita
Quiñones, Luis
2023-11-23T18:54:29Z
2023-11-23T18:54:29Z
2023-07
http://hdl.handle.net/10230/58367
Master's thesis for: Master's Degree in Data Science: Data Science for Decision Making Program. Academic year 2022-2023
Supervisors: Jesús Cerquides and Hannes Mueller
Advances in data and computing techniques have opened possibilities for real-time, cost-efficient conflict prediction and early warning, with news-based data being used to generate relevant forecasts. This Master’s thesis explores the use of big data news media for conflict prediction and anticipatory decision-making, with a focus on harnessing the Global Database of Events, Language and Tone (GDELT). We investigate the effectiveness of using GDELT events to predict conflict at the country level by extracting relevant features and comparing the performance of text-based models with different target definitions and time horizons. The results show that GDELT-based features perform well in conflict prediction, particularly in tree-based and LSTM models, indicating the value of text data for capturing patterns and providing insights into potential conflict events.
Avances en data y computación han abierto la posibilidad de la predicción costo eficiente y en tiempo real de conflictos, con capacidad de alerta temprana, utilizando datos basados en noticias para generar pronósticos relevantes. Esta tesis de maestría explora el uso de medios de comunicación de noticias de ‘big data’ para la predicción de conflictos y la toma de decisiones anticipadas, con un enfoque en aprovechar la Base de Datos Global de Eventos, Lenguaje y Tono (GDELT). Investigamos la efectividad de utilizar los eventos de GDELT para predecir conflictos a nivel de país mediante la extracción de características relevantes y comparando el rendimiento de modelos basados en texto con diferentes definiciones de objetivo y horizontes temporales. Los resultados muestran que las características basadas en GDELT tienen un buen desempeño en la predicción de conflictos, especialmente en modelos basados en algoritmos ‘Trees’ y LSTM, lo que indica el valor de utilizar datos de texto para capturar patrones y proporcionar información sobre posibles eventos de conflicto.
Submitted by MARIA PORT PUIG (maria.port@upf.edu) on 2023-11-23T18:54:29Z
No. of bitstreams: 1
BSETFM23_QuiChaPhi.pdf: 4558843 bytes, checksum: 0db29e19a12c16a482e8771c741a9cac (MD5)
Made available in DSpace on 2023-11-23T18:54:29Z (GMT). No. of bitstreams: 1
BSETFM23_QuiChaPhi.pdf: 4558843 bytes, checksum: 0db29e19a12c16a482e8771c741a9cac (MD5)
application/pdf
eng
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
https://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Master's thesis – Academic year 2022-2023
Prediction
Conflicts
GDELT
Predicción
Conflictos
Harnessing big data news media for conflict prediction and anticipatory decision-making
info:eu-repo/semantics/masterThesis
THUMBNAIL
BSETFM23_QuiChaPhi.pdf.jpg
IM Thumbnail
image/jpeg
13130
http://repositori.upf.edu/bitstream/10230/58367/3/BSETFM23_QuiChaPhi.pdf.jpg
dd0736b4cde49af9572fae3dbcfa9627
MD5
3
TEXT
BSETFM23_QuiChaPhi.pdf.txt
Extracted text
text/plain
78931
http://repositori.upf.edu/bitstream/10230/58367/2/BSETFM23_QuiChaPhi.pdf.txt
ab3fb28c80bd75585f0cd58353a5d424
MD5
2
ORIGINAL
BSETFM23_QuiChaPhi.pdf
application/pdf
4558843
http://repositori.upf.edu/bitstream/10230/58367/1/BSETFM23_QuiChaPhi.pdf
0db29e19a12c16a482e8771c741a9cac
MD5
1
10230/58367
oai:repositori.upf.edu:10230/58367
2023-12-07 09:24:27.919