CC* Integration-Large: Bringing Code to Data: A Collaborative Approach to Democratizing Internet Data Science
Funding source: NSF OAC-2126281. Period of performance: 10/01/2021 -- 09/30/2024.
Project Overview
Successful application of machine learning (ML) to networking problems depends on the availability of high-quality labeled data from real-world networks. Equally critical is the ability to share these datasets while respecting the data owners' privacy concerns. Unfortunately, short of sharing the data via today's commonly applied data-to-code paradigm, researchers lack a systematic framework for working with or benefiting from data collected and curated by third parties. Consequently, Internet Data Science as practiced today is ill-suited for applications such as (i) high-quality data labeling, (ii) rigorous evaluation of research artifacts such as learning models, and (iii) independent validation/reproducibility of reported research findings.
This collaborative project brings together researchers from UO, UCSB, and NIKSUN, Inc. and will investigate an innovative collaborative data labeling and knowledge sharing framework in three thrusts. First, the project will investigate a novel code-to-data approach that entails sharing of programmatic representations of operators' domain knowledge to identify events of interest in the data. Second, the project will design and develop a new learning framework to enable the pursuit of Internet Data Science as a full-fledged collaborative effort. Third, the project will illustrate the capabilities of the proposed framework in the context of collaborative efforts between two participating universities (UO and UCSB) and demonstrate its ability to scale to any number of participants.
The resulting framework will serve as a driving force for advancing collaborative efforts in the emerging area of Internet Data Science. In addition to identifying some of the fundamental changes to how ML ought to be used in networking, the research findings will benefit both industry and academia and will ensure that tomorrow's workforce has the proper training to fully exploit the application of ML for network-specific problems. Also, the outcomes will catalyze the development of a roadmap for the adoption of Internet Data Science efforts by operators and the deployment of ensuing research artifacts in real-world production networks.
People
- Lead PI: Ram Durairajan
- Co-PIs and Senior Personnel: Reza Rejaie (Co-PI, UO), Jon Miyake (Co-PI, UO), Arpit Gupta (Co-PI, UCSB), Walter Willinger (Senior Personnel, NIKSUN, Inc.)
- Ph.D. Students: TBD
- M.S. Students: Abduarraheem Elfandi, Mana Atarod
- B.S. Student Alumni: Jared Knofczynski
Publications
- Building Trust in Machine Learning-Powered Networking: The Network Explainer Framework
 Riya Ponraj, Ramakrishnan Durairajan, and Yu Wang
 In Proceedings of SIAM Data Mining (SDM '25) AI4TS workshop, Virginia, US, May 2025.
 [PAPER]    
 
- Bootstrapping Trust in ML4Nets Solutions with Hybrid Explainability
 Abduarraheem Elfandi, Hannah Sagalyn, Ramakrishnan Durairajan and Walter Willinger
 In Proceedings of the Workshop on Practical Adoption Challenges of ML for Systems (PACMI)
 co-located with ACM SOSP'24, Austin, TX, November 2024.
 [PAPER]    
 
- Leveraging Prefix Structure to Detect Volumetric DDoS Attack Signatures with Programmable Switches
 Chris Misa, Ramakrishnan Durairajan, Arpit Gupta, Reza Rejaie and Walter Willinger
 In IEEE Symposium on Security and Privacy (S&P) (Oakland '24), San Francisco, CA, May 2024.
 [PAPER]     [CODE]    
 
- Data-Fusion for Prefix-Level Inference: A DDoS Case Study
 Chris Misa, Ramakrishnan Durairajan, Reza Rejaie and Walter Willinger
 In Security Datasets for AI (SECDAI) workshop, virtual, April 2024.
 [PAPER]    
 
- Network Management with Graph Machine Learning: Challenges and Solutions
 Yu Wang and Ramakrishnan Durairajan
 In Security Datasets for AI (SECDAI) workshop, virtual, April 2024.
 [PAPER]    
 
- DynATOS+: A Network Telemetry System for Dynamic Traffic and Query Workloads
 Chris Misa, Ramakrishnan Durairajan, Reza Rejaie and Walter Willinger
 In IEEE/ACM Transactions on Networking, 2024.
 [PAPER]    
 
- Special Issue on The ACM SIGMETRICS Workshop on Measurements for Self-Driving Networks
 Arpit Gupta, Ramakrishnan Durairajan and Walter Willinger
 In Proceedings of ACM SIGMETRICS Performance Evaluation Review, 2023.
 [PAPER]    
 
- ARISE: A Multi-Task Weak Supervision Framework for Network Measurements
 Jared Knofczynski, Ramakrishnan Durairajan and Walter Willinger
 In IEEE JSAC Series on Machine Learning in Communications and Networks, July 2022.
 [PAPER]     [CODE]    
 
- Dynamic Scheduling of Approximate Telemetry Queries
 Chris Misa, Walt O'Connor, Ramakrishnan Durairajan, Reza Rejaie and Walter Willinger
 In Proceedings of USENIX NSDI'22, Renton, WA, April 2022.
 [PAPER]     [PROJECT WEBSITE]     [CODE]    
 
 
- Revisiting Network Telemetry in COIN: A Case for Runtime Programmability
 Chris Misa, Ramakrishnan Durairajan, Reza Rejaie and Walter Willinger
 In IEEE Network (In-Network Computing: Emerging Trends for the Edge-Cloud Continuum), September 2021.
 [PAPER]     [PROJECT WEBSITE]    
 
- Challenges in Using ML for Networking Research: How to Label If You Must
 Yukhe Lavinia, Ramakrishnan Durairajan, Reza Rejaie and Walter Willinger
 In Proceedings of the Workshop on Network Meets AI & ML (NetAI'20)
 co-located with ACM SIGCOMM'20, New York, USA, August 2020.
 [PAPER]    
 
