Giulia Preti

PhD @ University of Trento

«That quite definitely is the answer. I think the problem, to be quite honest with you, is that you’ve never actually known what the question is.»
“The Hitchhiker's Guide to the Galaxy” by Douglas Adams

All hail!

I am Giulia Preti.

I am a member of the DbTrento research group, working on techniques for mining relevant structures in dynamic and heterogeneous datasets.

I got my PhD in Information and Communication Technology at the University of Trento (Italy), under the supervision of Prof. Yannis Velegrakis.

I was also the teaching assistant for the Computability and Computational Complexity course from 2015 to 2018.

I got my master's degree in Computer Science and my bachelor's degree in Mathematics, both at the University of Trento.


[Full CV]

Research Interests:

My research focuses on graphs, a versatile data model that is increasingly used to represent a wide variety of data, from biology to social networks, and from computer networks to smart cities. In particular, I consider weighted graphs and dynamic graphs.

Weighted graphs are graphs whose nodes and edges are labeled with weights indicating their relevance or quality. Moreover, in applications that aim to offer personalized products and services to each individual user rather than “one size fits all” solutions, each element of the graph naturally carries multiple weights, one for each user. My goal is to identify structures that appear frequently in the graph and whose appearances are characterized by large weights. Under the assumption that larger weights indicate higher interest, such structures are the ones most relevant for the user.

Dynamic graphs are graphs that change over time, meaning that their nodes and edges can undergo both structural and attribute changes. They are generally modeled as sequences of static graphs called snapshots. In this context, I am interested in detecting groups of edges that evolve in a convergent manner, meaning that their behavior over time is positively correlated. These groups of correlated edges, especially when they involve edges that are topologically close, can represent regions of interest in the network.

During my PhD studies, I also worked on entity resolution in highly heterogeneous and temporal databases, defined as collections of records characterized by different schemata and by timestamps indicating their date of creation. Reconciling the records in this setting requires specialized similarity functions that take into consideration both the heterogeneity and the dynamism of the data. In my work, I proposed a suitable time-aware, schema-agnostic similarity measure and a framework that uses this measure to identify maximal groups of similar temporal records.

Publications:

ExCoDE: a Tool for Discovering and Visualizing Regions of Correlation in Dynamic Networks.


(to appear in ICDMW19)
(PDF) (poster)

Mining Patterns in Graphs with Multiple Weights.


Distributed and Parallel Databases Journal, 1-39, 2019
https://doi.org/10.1007/s10619-019-07259-w

Beyond Frequencies: Graph Pattern Mining in Multi-weighted Graphs.


Proceedings of the 21st International Conference on Extending Database Technology (EDBT), March 26-29, 2018
(PDF) (PPT)

Projects and Code:

ReSuM

Overview

ReSuM is a framework to mine relevant patterns from large weighted and multi-weighted graphs. Assuming that the importance of a pattern is determined not only by its frequency in the graph, but also by the edge weights of its appearances, we propose four scoring functions to compute the relevance of the patterns. These functions satisfy the apriori property, and thus can rely on efficient pruning strategies.

The framework includes an exact and an approximate mining algorithm. The first is characterized by intelligent storage and computation of the pattern scores, while the second is based on the aggregation of similar weighting functions to allow scalability and avoid redundant computations.

Code

The code of this project is publicly available on GitHub, while the datasets used in the experiments may be provided upon request.

ExCoDE

Overview

ExCoDE is a general framework to mine diverse dense correlated subgraphs from dynamic networks. The correlation of a subgraph is computed as the minimum pairwise Pearson correlation between its edges. The density of a subgraph is computed over the snapshots of the network where the subgraph is active, either as the minimum of its average degree across those snapshots, or as the mean of its average degree across those snapshots. The similarity between different subgraphs is measured as the Jaccard similarity between the corresponding sets of edges.
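The three measures described above can be illustrated with a small Python sketch. This is not the ExCoDE implementation; it is a minimal example under several assumptions of mine: a dynamic network is a list of snapshots, each snapshot is a set of edges written as node pairs with a fixed orientation, an edge is described by its 0/1 presence series across snapshots, and a subgraph counts as active in a snapshot when at least one of its edges is present. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    return cov / sqrt(vx * vy)


def presence_series(edge, snapshots):
    """0/1 series telling in which snapshots the edge exists (assumed encoding)."""
    return [1 if edge in snap else 0 for snap in snapshots]


def min_pairwise_correlation(edges, snapshots):
    """Correlation of a subgraph: minimum Pearson correlation over all edge pairs."""
    series = {e: presence_series(e, snapshots) for e in edges}
    return min(pearson(series[a], series[b]) for a, b in combinations(edges, 2))


def average_degree(edges):
    """Average degree of the graph induced by the given edges: 2|E| / |V|."""
    nodes = {v for e in edges for v in e}
    return 2 * len(edges) / len(nodes) if nodes else 0.0


def densities(edges, snapshots):
    """Return (minimum, mean) of the average degree over the active snapshots.

    Assumption: the subgraph is active in a snapshot if any of its edges is present.
    """
    per_snapshot = []
    for snap in snapshots:
        present = [e for e in edges if e in snap]
        if present:
            per_snapshot.append(average_degree(present))
    if not per_snapshot:
        return 0.0, 0.0
    return min(per_snapshot), sum(per_snapshot) / len(per_snapshot)


def jaccard(edges_a, edges_b):
    """Jaccard similarity between the edge sets of two subgraphs."""
    a, b = set(edges_a), set(edges_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0
```

A possible use is to call `min_pairwise_correlation(triangle, snapshots)` and `densities(triangle, snapshots)` on a candidate subgraph such as `triangle = [("a", "b"), ("b", "c"), ("a", "c")]`, and `jaccard` on two candidates to decide whether they are too similar to both be reported. The notion of an active subgraph and the presence-based edge series are simplifications chosen for the example.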

Code

The demo of this project is available on GitHub, together with several datasets. Additional information may be provided upon request.