Ferres, Leo
Sacasa Ares, Manuel Antonio
2021-08-30
2021-08-30
2020-08
http://hdl.handle.net/11447/4531
Thesis submitted to the Faculty of Engineering of the Universidad del Desarrollo for the degree of Magíster en Ciencia de Datos (Master in Data Science).
Home location detection is one of the main profiling attributes for telcos, real estate, banking, advertising and targeting companies. The literature offers many examples of criteria, heuristics and algorithms that try to solve the problem. The objective of this document is to select a final criterion and data source by comparing five algorithms, a two-week data set drawn from three mobile data sources (CDRs, XDRs and Control Plane), and a real-address data set of 75 inhabitants, built with a survey method and used as the ground-truth test set. This leads us to understand:
- The behavior of the data sources: user hit frequency and its impact on algorithm results, the method of processing the rank-one home antenna, the dependence of source behavior on the human day-night cycle, home antenna precision, hit recording logic, noise, error, and the limits of each source for solving home location.
- The possible solutions: types of criteria, geo-time heuristics, counting algorithm types (one-step or two-step), and the performance and precision of each algorithm on each source.
- The metrics: the correlation between the closest tower in Euclidean distance and the home tower (the output of each algorithm), the bias in results from 3G versus 4G (LTE) antennas, and the Euclidean distance, taken as the error, between the declared real address coordinates and the home tower, whether or not it is the closest tower.
The first step of the document takes the recommendations and implementations of HDAs (home detection algorithms) and criteria [1][3][5][6][14][16] to compare the differences in behavior between sources and between the criteria applied by each algorithm (ranking antennas by hits, ranking antennas by frequency, ranking with a time filter, ranking with a geographical filter, and mixed geo-time filters).
The second step is to design metrics to compare the best-performing pairs of algorithm and source: a binary metric for a match between the HDA result and the home antenna ranking, a valued match against the home antennas ranked 1, 2 and 3, the absolute error distance, and the MSE distance to the real address. The final step is to test an experimental new approach that applies a circadian sleep cycle to each user in order to detect a time range and process an individual time-range home antenna, as a solution for the gaps detected in the first two steps. The circadian results are compared against the best HDA method, source and ground truth.
20 p.
en
Home
Location
Mobile
Residence
CDR
XDR
Signalization
Mobility
Urban
070037S
Home location detection algorithm comparison using mobile phone data vs real users ground truth
Thesis