Alessandro Finamore

Paris, FRANCE · mail@afinamore.io

I am a researcher working in Internet measurements at the intersection of Deep Learning, BigData and data-plane programming. Currently, I am a Principal Engineer at the Huawei AI4NET Datacom lab in Paris (France), focusing on the integration of Deep Learning into traffic monitoring systems for continuous learning and network automation.

Previously, I was a research associate at Telefonica Research, and a Principal Engineer at Telefonica UK/O2, where I designed and deployed in production an ML product to predict daily customer satisfaction for 30M+ O2 customers using a variety of live network logs.


Projects

A Machine Learning and Deep Learning modeling framework for Traffic Classification
Github Documentation

Talks

Taming the Data Divide to Enable AI-Driven Networks SLIDES details
IFIP Traffic Measurements and Analysis (TMA) ⋅ keynote ⋅ Jun, 2022
Artificial Intelligence (AI) is increasingly penetrating network design and operation. Yet whether, what, how and where to integrate AI in networks is still largely debated, arguably due to the fundamental need for effective sharing of data and measurements. In this talk we review the challenges surrounding this renewed “data divide” and discuss possible ways to mitigate them.

Network Intelligence for the future SLIDES
Mediterranean Communication and Computer Networking Conference (MedComNet) ⋅ panel ⋅ Jun, 2021

Research Papers

2024
  • Data Augmentation for Traffic Classification PDF details
    C. Wang, A. Finamore, P. Michiardi, M. Gallo, D. Rossi
    Passive and Active Measurements (PAM)
    Data Augmentation (DA) -- enriching training data by adding synthetic samples -- is a technique widely adopted in Computer Vision (CV) and Natural Language Processing (NLP) tasks to improve model performance. Yet, DA has struggled to gain traction in networking contexts, particularly in Traffic Classification (TC) tasks. In this work, we fill this gap by benchmarking 18 augmentation functions applied to 3 TC datasets using packet time series as input representation and considering a variety of training conditions. Our results show that (i) DA can reap benefits previously unexplored, (ii) augmentations acting on time series sequence order and masking are better suited for TC than amplitude augmentations and (iii) a basic analysis of the models' latent space can help in understanding the positive/negative effects of augmentations on classification performance.

    @inproceedings{AF:PAM24, title={Data Augmentation for Traffic Classification}, author={C. {Wang} and A. {Finamore} and P. {Michiardi} and M. {Gallo} and D. {Rossi}}, year={2024}, booktitle={Passive and Active Measurements (PAM)}, location={Virtual}, doi={10.1007/978-3-031-56249-5_7}, howpublished="https://afinamore.io/pubs/PAM24_data-augmentation.pdf" }
2023
  • Toward Generative Data Augmentation for Traffic Classification PDF details
    C. Wang, A. Finamore, P. Michiardi, M. Gallo, D. Rossi
    Student workshop at ACM Conference on emerging Networking Experiments and Technologies (CoNEXT)
    Data Augmentation (DA) -- augmenting training data with synthetic samples -- is widely adopted in Computer Vision (CV) to improve model performance. Conversely, DA has not yet been popularized in networking use cases, including Traffic Classification (TC). In this work, we present a preliminary study of 14 hand-crafted DAs applied on the MIRAGE19 dataset. Our results (i) show that DA can reap benefits previously unexplored in TC and (ii) foster a research agenda on the use of generative models to automate DA design.

    @inproceedings{AF:CoNEXT23, title={Toward Generative Data Augmentation for Traffic Classification}, author={C. {Wang} and A. {Finamore} and P. {Michiardi} and M. {Gallo} and D. {Rossi}}, year={2023}, booktitle={Conference on emerging Networking Experiments and Technologies (CoNEXT), Student workshop}, location={Paris, France}, doi={}, howpublished="https://afinamore.io/pubs/CoNEXT23_sw_handcrafted_da.pdf" }
  • Replication: Contrastive Learning and Data Augmentation in Traffic Classification Using a Flowpic Input Representation PDF SLIDES details
    ACM Internet Measurement Conference (IMC)

    @inproceedings{AF:IMC23, title={Replication: Contrastive Learning and Data Augmentation in Traffic Classification Using a Flowpic Input Representation}, author={}, year={2023}, booktitle={Internet Measurement Conference (IMC)}, location={Montreal, Canada}, doi={}, howpublished="https://afinamore.io/pubs/IMC23_replication.pdf" }
  • Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification PDF details
    I. Guarino, C. Wang, A. Finamore, A. Pescape, D. Rossi
    IEEE/IFIP Traffic Measurement and Analysis (TMA)
    The popularity of Deep Learning (DL), coupled with network traffic visibility reduction due to the increased adoption of HTTPS, QUIC and DNS-SEC, re-ignited interest towards Traffic Classification (TC). However, to tame the dependency on large task-specific labeled datasets we need to find better ways to learn representations that are valid across tasks. In this work we investigate this problem by comparing transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models (16 methods total). Using two publicly available datasets, namely MIRAGE19 (40 classes) and AppClassNet (500 classes), we show that (i) using large datasets we can obtain more general representations, (ii) contrastive learning is the best methodology and (iii) meta-learning the worst one, and (iv) while tree-based ML models cannot handle large tasks but fit small tasks well, DL methods, by reusing learned representations, are approaching the performance of tree-based models on small tasks too.

    @inproceedings{AF:TMA23, title={Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification}, author={I. {Guarino} and C. {Wang} and A. {Finamore} and A. {Pescape} and D. {Rossi}}, year={2023}, booktitle={Traffic Measurement and Analysis (TMA)}, location={Naples, Italy}, doi={10.48550/arXiv.2305.12432}, howpublished="https://afinamore.io/pubs/TMA23_manyorfew.pdf" }
  • "It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning PDF details
    R. Azorin, M. Gallo, A. Finamore, D. Rossi, P. Michiardi
    International Workshop on Practical Deep Learning in the Wild (PracticalDL) - colocated with AAAI
    While the promises of Multi-Task Learning (MTL) are attractive, characterizing the conditions of its success is still an open problem in Deep Learning. Some tasks may benefit from being learned together while others may be detrimental to one another. From a task perspective, grouping cooperative tasks while separating competing tasks is paramount to reap the benefits of MTL, i.e., reducing training and inference costs. Therefore, estimating task affinity for joint learning is a key endeavor. Recent work suggests that the training conditions themselves have a significant impact on the outcomes of MTL. Yet, the literature lacks a benchmark to assess the effectiveness of task affinity estimation techniques and their relation with actual MTL performance. In this paper, we take a first step in filling this gap by (i) defining a set of affinity scores, both revisiting contributions from previous literature and presenting new ones, and (ii) benchmarking them on the Taskonomy dataset. Our empirical campaign reveals how, even in a small-scale scenario, task affinity scoring does not correlate well with actual MTL performance. Yet, some metrics can be more indicative than others.

    @inproceedings{AF:PracticalDL23, title={"It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning}, author={R. {Azorin} and M. {Gallo} and A. {Finamore} and D. {Rossi} and P. {Michiardi}}, year={2023}, booktitle={International Workshop on Practical Deep Learning in the Wild (PracticalDL)}, location={Washington, US}, doi={10.48550/arXiv.2301.02873}, howpublished="https://afinamore.io/pubs/PracticalDL23_mtl.pdf" }
2022
  • Accelerating Deep Learning Classification with Error-controlled Approximate-key Caching PDF details
    A. Finamore, J. Roberts, M. Gallo, D. Rossi
    IEEE International Conference on Computer Communications (INFOCOM)
    While Deep Learning (DL) technologies are a promising tool to solve networking problems that map to classification tasks, their computational complexity is still too high with respect to real-time traffic measurements requirements. To reduce the DL inference cost, we propose a novel caching paradigm, that we named approximate-key caching, which returns approximate results for lookups of selected input based on cached DL inference results. While approximate cache hits alleviate DL inference workload and increase the system throughput, they however introduce an approximation error. As such, we couple approximate-key caching with an error-correction principled algorithm, that we named auto-refresh. We analytically model our caching system performance for classic LRU and ideal caches, we perform a trace-driven evaluation of the expected performance, and we compare the benefits of our proposed approach with the state-of-the-art similarity caching -- testifying the practical interest of our proposal.

    @inproceedings{AF:INFOCOM22, title={Accelerating Deep Learning Classification with Error-controlled Approximate-key Caching}, author={A. {Finamore} and J. {Roberts} and M. {Gallo} and D. {Rossi}}, year={2022}, booktitle={IEEE International Conference on Computer Communications (INFOCOM)}, location={Virtual Event}, doi={10.1109/INFOCOM48880.2022.9796677}, howpublished="https://afinamore.io/pubs/INFOCOM22_approximate_key_caching.pdf" }
  • Towards a systematic multi-modal representation learning for network data PDF details
    Z. B. Houidi, R. Azorin, M. Gallo, A. Finamore, D. Rossi
    ACM Workshop on Hot Topics in Networks (HotNets)
    Learning the right representations from complex input data is the key ability of successful machine learning (ML) models. The latter are often tailored to a specific data modality. For example, recurrent neural networks (RNNs) were designed having sequential data in mind, while convolutional neural networks (CNNs) were designed to exploit spatial correlation in images. Unlike computer vision (CV) and natural language processing (NLP), each of which targets a single well-defined modality, network ML problems often have a mixture of data modalities as input. Yet, instead of exploiting such abundance, practitioners tend to rely on sub-features thereof, reducing the problem to single modality for the sake of simplicity. In this paper, we advocate for exploiting all the modalities naturally present in network data. As a first step, we observe that network data systematically exhibits a mixture of quantities (e.g., measurements), and entities (e.g., IP addresses, names, etc.). Whereas the former are generally well exploited, the latter are often underused or poorly represented (e.g., with one-hot encoding). We propose to systematically leverage language models to learn entity representations, whenever significant sequences of such entities are historically observed. Through two diverse use-cases, we show that such entity encoding can benefit and naturally augment classic quantity-based features.

    @inproceedings{AF:HotNets22, title={Towards a systematic multi-modal representation learning for network data}, author={Z. B. {Houidi} and R. {Azorin} and M. {Gallo} and A. {Finamore} and D. {Rossi}}, year={2022}, booktitle={ACM Workshop on Hot Topics in Networks (HotNets)}, doi={10.1145/3563766.3564108}, howpublished="https://afinamore.io/pubs/HotNets22_representation.pdf" }
  • AppClassNet: a commercial-grade dataset for application identification research PDF details
    C. Wang, A. Finamore, L. Yang, K. Fauvel, D. Rossi
    ACM Computer Communication Review (CCR)
    The recent success of Artificial Intelligence (AI) is rooted in several concomitant factors, namely theoretical progress coupled with abundance of data and computing power. Large companies can take advantage of a deluge of data, typically withheld from the research community due to privacy or business sensitivity concerns, and this is particularly true for networking data. Therefore, the lack of high quality data is often recognized as one of the main factors currently limiting networking research from fully leveraging the potential of AI methodologies. Following numerous requests we received from the scientific community, we release AppClassNet, a commercial-grade dataset for benchmarking traffic classification and management methodologies. AppClassNet is significantly larger than the datasets generally available to the academic community in terms of both the number of samples and classes, and reaches scales similar to the popular ImageNet dataset commonly used in computer vision literature. To avoid leaking user- and business-sensitive information, we opportunely anonymized the dataset, while empirically showing that it still represents a relevant benchmark for algorithmic research. In this paper, we describe the public dataset and our anonymization process. We hope that AppClassNet can be instrumental for other researchers to address more complex commercial-grade problems in the broad field of traffic classification and management.

    @article{AF:CCR22, title={AppClassNet: a commercial-grade dataset for application identification research}, author={C. {Wang} and A. {Finamore} and L. {Yang} and K. {Fauvel} and D. {Rossi}}, year={2022}, journal={ACM Computer Communication Review (CCR)}, doi={10.1145/3561954.3561958}, howpublished="https://afinamore.io/pubs/CCR22_appclassnet.pdf" }
full list

Experience

Principal Engineer

HUAWEI · Paris, France

My current role is at the cross-over between BigData, networks, and AI. I work in the Huawei DataCom R&D AI team focused on integrating AI in data-plane programming, distributed telemetry and other network monitoring solutions for the Huawei DataCom product line. In particular, I’m leading the research related to traffic classification with an emphasis on continual learning (e.g., incremental learning and few-shot learning) and data augmentation (e.g., self-supervision). I’m also responsible for the design and prototyping of next-generation network probes which can take advantage of ML/DL via advanced GPU/TPU cards (e.g., Huawei Ascend 310 / 910) to compute advanced network analytics.

September 2019 - Present

Principal Engineer

Telefonica UK / O2 · London, UK

I led the design and development of BigData analytics (using Apache Spark) and ML applications (using Keras and scikit-learn) in a project that started as a research experiment and later graduated to a product. I took advantage of a large on-premise Hadoop cluster (250+ nodes), which stored data collected from different network core monitoring elements, to create insights about users' quality of experience (QoE) that were used to model the satisfaction of 30M+ customers. I was responsible for the design, implementation, and operation of the whole pipeline (analytics+modeling), which has been successfully used internally. In parallel, I remained part of the research community (in particular related to traffic analysis), with different collaborations with universities and other research centres.

January 2018 - September 2019

Research Associate

Telefonica Research · Barcelona, Spain

I worked on research projects related to mobile network analytics spanning from traffic encryption (e.g., HTTP2 adoption and performance), to users' quality of experience (e.g., mobile critical path analysis) and users' behavior (e.g., users' mobility). I also collaborated with different operating businesses within Telefonica global (e.g., O2/UK, Movistar/Peru, Movistar/Argentina) across different projects related to network analytics and BigData (e.g., using radio tower KPIs to understand users' experience).

December 2014 - January 2018

Visiting Scholar

Narus Inc · Sunnyvale, CA, U.S.

I worked on developing novel techniques for identifying and dissecting network traffic generated by malware executables, rootkit applications, and more general mobile/host traffic behaviors. The techniques developed led to the discovery of security issues actually exploited in the wild.

October 2013 - September 2014

Internship

Telefonica I+D · Barcelona, Spain

I worked on a project to extract network analytics from a country-scale dataset by means of a Hadoop cluster. The work led to the publication of one of the first studies related to the mobile ads ecosystem.

January 2012 - April 2014

Visiting PhD Student

Purdue University · Lafayette, Indiana, U.S.

I worked on a research project related to understanding the YouTube CDN by means of data gathered from passive network probes we deployed in Italian and Polish ISPs. We were the first to uncover YouTube CDN dynamics and (at the time) the aggressive buffering of the YouTube player.

September 2010 - May 2011

Education

Politecnico di Torino

Ph.D. in Electronics and Telecommunication Engineering.

2008 - 2012

Politecnico di Torino

Master of Science in Computer Engineering (110L/110)

2005 - 2008

Politecnico di Torino

Bachelor of Science in Computer Engineering (110/110)

2001 - 2005