Network Science Institute
Mary Elizabeth Sutherland is a senior editor handling manuscripts in the behavioral sciences. She previously handled manuscripts spanning the same general disciplines and topics at Nature Human Behaviour, as a senior editor, and at Nature Communications, as an associate editor. Prior to her editorial career, Mary Elizabeth obtained a PhD in cognitive neuropsychology from McGill University, where she worked in auditory cognitive neuroscience with Dr. Robert Zatorre. She continued her training both at the Max Planck Institute for Human Cognitive and Brain Sciences in Leipzig and at the Catholic University in Santiago, Chile. She was briefly a professor at the latter institution, in a new position the university created to span the medical and social sciences, before realizing that she would be a better editor than researcher and moving back to New York to take the editorial position at Nature Communications.
Abstract: Suppose we observe a matrix of data with a low-rank “signal” obscured by noise. The standard way to find the signal, at least approximately, is PCA (principal component analysis): just look at the eigenvectors of the matrix. For Gaussian noise, random matrix theory tells us exactly how well this works: that is, the accuracy we can achieve as a function of the signal-to-noise ratio. For tensors, such as three-index tables A_{ijk}, the situation is much more complex. Here there seems to be a “statistical-computational gap,” namely a regime where finding the signal is possible but exponentially hard. Physically, this corresponds to a “glass transition,” where the optimum becomes hidden behind an energy barrier. Mathematically, it means that we believe no polynomial-time algorithm exists, and that exhaustive search is necessary. I’ll give evidence for this exponential hardness by showing that no algorithm remotely similar to PCA can work. Along the way, I’ll give an introduction to tensor networks — a generalization of matrix products and traces that everyone, including network theorists, should know about.
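For intuition about the matrix case described in the abstract, here is a minimal sketch (dimension and signal strength are illustrative choices, not taken from the talk) that plants a rank-one spike in symmetric Gaussian noise and recovers it via the top eigenvector:

```python
import numpy as np

rng = np.random.default_rng(0)
n, lam = 500, 5.0  # dimension and signal-to-noise ratio (illustrative values)

# Planted rank-one "signal": a random unit spike v
v = rng.normal(size=n)
v /= np.linalg.norm(v)

# Symmetric Gaussian noise, scaled as in the spiked-matrix model
W = rng.normal(size=(n, n))
W = (W + W.T) / np.sqrt(2 * n)

Y = lam * np.outer(v, v) + W

# PCA: the leading eigenvector of Y estimates the spike
_, eigvecs = np.linalg.eigh(Y)
v_hat = eigvecs[:, -1]

# Above the spectral threshold (lam > 1), the overlap is close to 1
overlap = abs(v_hat @ v)
print(round(overlap, 2))
```

In the large-n limit, random matrix theory gives the squared overlap as roughly 1 − λ⁻² above the threshold λ = 1, and zero below it. For tensors, no analogous spectral shortcut is believed to exist in the hard regime, which is the point of the talk.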
Biography: Cristopher Moore received his B.A. in Physics, Mathematics, and Integrated Science from Northwestern University, and his Ph.D. in Physics from Cornell. From 2000 to 2012 he was a professor at the University of New Mexico, with joint appointments in Computer Science and Physics. Since 2012, Moore has been a resident professor at the Santa Fe Institute; he has also held visiting positions at the École Normale Supérieure, École Polytechnique, Université Paris 7, École Normale Supérieure de Lyon, Northeastern University, the University of Michigan, and Microsoft Research. He has written 160 papers at the boundary between mathematics, physics, and computer science, ranging from quantum computing, social networks, phase transitions in NP-complete problems, and Bayesian inference to risk assessment in criminal justice. He is an elected Fellow of the American Physical Society, the American Mathematical Society, and the American Association for the Advancement of Science. With Stephan Mertens, he is the author of The Nature of Computation from Oxford University Press.
Abstract: Many data are metric, that is, they come with distances between data points. Metric geometry should therefore provide tools for their analysis. Since the most important concept of geometry is curvature, many curvature concepts have been proposed and developed in metric geometry. I will explain these concepts systematically and show how they can yield insight into data from a wide range of domains.
Bio: Jürgen Jost is a Professor of Mathematics and Director of the Max Planck Institute for Mathematics in the Sciences (MPI-MiS) in Leipzig, Germany, and an External Professor at the Santa Fe Institute in New Mexico, USA. He was born in Münster, Germany, in 1956. He studied mathematics, physics, economics and philosophy at the University of Bonn from 1975 to 1980, and in 1980 he also completed his PhD in mathematics at the same university. He has held various postdoctoral and visiting positions at IAS Princeton, UC San Diego, ANU Canberra, MSRI Berkeley, Harvard, ETH Zürich and IHES Paris. From 1984 to 1996 he was Professor of Mathematics at the Ruhr University Bochum, and in 1996 he moved to Leipzig, where together with Eberhard Zeidler and Stefan Müller, he founded the MPI-MiS. To date, Jürgen Jost has written more than 600 research articles and more than 20 books, spanning many different areas of mathematics and applied sciences, as well as philosophy and history of science. He has supervised more than 60 PhD students and numerous postdocs.
Abstract: Human mobility is a critical driver of epidemics by substantially altering the probability of encounters, patterns of exposure, and the likelihood of disease propagation. While long-range movements may shape patterns of pathogen importation, short-range mobility and contact structures amplify local epidemics. Characterizing mobility patterns and social mixing across scales is therefore essential for understanding why and how epidemics emerge and spread, as well as for developing effective prevention and control strategies. The COVID-19 crisis sparked a data-sharing revolution, with network operators such as Orange and Telefonica, along with tech giants like Google, Apple, and Facebook, providing real-time aggregated mobility data from mobile phone traces to track human mobility and help fight the pandemic. Epidemiological research is now focused on developing novel mathematical and computational frameworks to integrate high-resolution mobility data into models, enabling both retrospective analyses and real-time epidemic monitoring. In my talk, I will discuss how we utilized these data during the early stages of COVID-19 in France to capture the dynamic shifts in social mixing caused by mobility interventions and address critical public health questions. Additionally, I will present a retrospective theoretical study that characterizes the mobility factors shaping geographical diffusion across scales in the United States and demonstrates a model designed to optimize reliability for outbreak response while balancing mobility data requirements.
Bio: Dr. Giulia Pullano is a postdoctoral fellow at Georgetown University in Washington, DC, USA, working in the Bansal Lab within the Biology Department. Her research focuses on developing mathematical and computational models to understand the geographical dynamics of human-to-human diseases and inform public health policies. She is particularly interested in characterizing seasonal patterns in human behavior and disruptions during epidemics or extreme events to integrate them into epidemic models and optimize public health interventions. From 2020 to 2022, Dr. Pullano was actively involved in the COVID-19 pandemic response, advising French public health agencies and government authorities. Dr. Pullano earned her PhD in Biomathematics and Public Health from the French National Institute of Health and Medical Research (INSERM), Sorbonne University, and Orange S.A., under the supervision of Dr. Vittoria Colizza. She obtained a Master’s degree in Physics of Complex Systems from Università degli Studi di Torino in 2016 and a Bachelor's degree in Physics from Università degli Studi di Roma La Sapienza in 2014.
Abstract: There is a gender gap in human mobility, with women travelling shorter distances, visiting fewer unique locations, and exhibiting lower physical activity levels compared to men. Previous studies in geography, transportation, the social sciences, and, more recently, quantitative studies of human mobility have emphasized the need to study behavioral heterogeneities in mobility and to explore human mobility from a gendered perspective. Human mobility is characterized by a remarkable regularity and predictability, largely driven by work-related commutes. Work often defines the need to be at a specific place (the workplace) at specific times (work hours) and for a fixed duration (the workday). This has led researchers to hypothesize that the notable gender differences in the labor market might underpin the observed differences in mobility patterns between men and women.
In this talk, we will examine the impact of work constraints and gender on human mobility using a large-scale dataset that captures the movements of 600,000 individuals who self-declared as female or male, spanning ten countries. We will explore well-known mobility metrics and the differences in the structure of individuals' networks of visited locations. Finally, I will show that gender differences in mobility persist even when work constraints are accounted for, suggesting that other factors, such as family obligations and societal norms, may play a role in shaping the gender differences in mobility.
Bio: Silvia is a PhD student at the Technical University of Denmark (DTU) and a member of the Social Complexity Lab, led by Laura Alessandretti and Sune Lehmann. Currently, she is a Visiting Scholar at the MIT Senseable City Lab, where she will be based until January 2025. Her research focuses on aspects of human online and offline behaviour using large-scale data, and methods from Complex Systems and Computational Social Science. A key aspect of her PhD work is investigating behavioral inequalities, with a particular emphasis on the gender differences in human mobility.
Abstract: Value Transmission on TikTok (Adolescence). Values are essential life goals that shape an individual's identity, choices, attitudes, and behaviors. Traditionally transmitted primarily through parents, value communication is undergoing a transformation with the rise of social media platforms like TikTok, which is now used by 67% of teenagers, with 50% engaging almost constantly. While much research on social media influencers has focused on marketing, the values conveyed through TikTok content remain underexplored. This presentation examines the values present in TikTok posts, the strategies influencers use to communicate them, and how adolescents perceive and adopt these values. We manually coded nearly 1,000 posts from 100 influencers across various genres for 19 values based on Schwartz’s framework and identified different communicative strategies. Additionally, we are developing an NLP tool to predict the values transmitted in TikTok content, allowing us to expand our dataset and deepen our understanding of how these values influence today's youth. Our findings provide critical insights into how TikTok shapes adolescents' value systems, offering a fresh perspective on digital value transmission in the social media age.
Bio: I have a background in Computer Science and Bioinformatics, with over a decade of experience in the biotech industry, specializing in drug development and big data analysis. After earning a PhD with a focus on Next Generation Sequencing, I transitioned into education research. My current postdoctoral work explores intersections between computational methods and social science, particularly in understanding child development.
Abstract: While aggregated mobile device location data have been extensively used to model SARS-CoV-2 dynamics, relationships between mobility behavior and the transmission of other respiratory pathogens are less understood. Understanding the influence of human mobility on endemic pathogens is crucial for predictive purposes, especially as perturbed circulation can lead to overlapping epidemics of different pathogens, putting extreme strain on healthcare systems. In this seminar, I will present research investigating the effects of population behavior on the transmission of 17 endemic viruses and SARS-CoV-2 in Seattle, Washington, during pre- and post-pandemic years, using detailed data from a citywide respiratory pathogen surveillance study and high-resolution cellphone mobility data. I will highlight mobility metrics that are consistent leading indicators of outbreaks and compare patterns across pathogens with different transmission modes, seasonal cycles, and age distributions of infection. Additionally, I will discuss recent work linking the evolutionary and epidemiological dynamics of influenza in the US and future plans to explore the effects of decreased social distancing and waning immunity on the post-pandemic reemergence of respiratory syncytial virus (RSV) in Seattle.
Bio: Dr. Amanda Perofsky is a research scientist in the Brotman Baty Institute for Precision Medicine at the University of Washington. Prior to joining UW, she completed her PhD in Ecology, Evolution, and Behavior with Dr. Lauren Ancel Meyers at the University of Texas at Austin and a postdoctoral fellowship with Dr. Cécile Viboud at the Fogarty International Center, US National Institutes of Health. Dr. Perofsky’s research focuses on the ecological, evolutionary, and behavioral drivers of respiratory virus infections, with aims to improve infectious disease surveillance and better understand and predict recurring and emerging outbreaks. She applies statistical and computational approaches to study respiratory virus transmission patterns and epidemiology, with a particular focus on influenza and SARS-CoV-2. She also produces operational forecasts and projections of respiratory virus outbreaks.
Abstract: Community detection is one of the most relevant tasks in the analysis of graphs, as many real-world networks have been shown to exhibit a community structure. While many community detection algorithms have been developed in recent years, most of these are designed for standard single-layer graphs. However, this can be an oversimplification of reality. In the first part of the talk, we will deal with the community detection and graph semi-supervised learning problems extended to multiplex networks, i.e., networks with multiple layers sharing the same node set and having no inter-layer connections. The contributions are both in the problems' formulation and in their resolution, applying and adapting suitable, tailored optimization methods. In the second part of the talk, we will focus on the analysis of collaborations between scholars. Collaboration is crucial for deepening existing knowledge and gaining exposure to new ideas. We will investigate how researchers influence each other with their research topics, and how the COVID-19 pandemic affected researcher collaborations.
Bio: Sara Venturini is currently a Postdoctoral Fellow at the MIT Senseable City Lab. She earned a Ph.D. in Computational Mathematics in 2023 from the University of Padova, where she started her academic career with a Bachelor’s and Master’s in Mathematics. In 2022, she won a fellowship within the AccelNet-MultiNet program, enabling her to visit Indiana University in Bloomington. Currently, she is interested in combining her computational and applied mathematics background with her passion for complex networks in real-world social science applications. Sara’s current research interests include higher-order networks, optimization methods, machine learning, and the science of science.
Abstract: Machine learning systems now routinely use embeddings in thousands of dimensions to extract patterns from large-scale network data. Should we embrace this data revolution and let go of the simpler theories of yore—the likes of the S1 and Bradley-Terry models? In this talk, I will argue that low-dimensional embedding can find concise, interpretable patterns in networks and thus have a place in any modern data science stack. I will illustrate this point through a number of stories about social hierarchies and decision-making.
Bio: Jean-Gabriel Young is an Assistant Professor of Mathematics and Statistics at The University of Vermont, where he also holds faculty affiliations at the Translational Global Infectious Diseases Research Center and the Vermont Complex Systems Center. Professor Young’s research is at the intersection of statistical inference, epidemiology, and complex systems. Previously, he was a James S. McDonnell Foundation Fellow at the Center for the Study of Complex Systems of the University of Michigan, mentored by Professor Mark Newman. He obtained his PhD in Physics from Université Laval, under the guidance of Prof. Louis J. Dubé and Prof. Patrick Desrosiers.
Abstract: Here we represent human lives in a way that shares structural similarity to language, and we exploit this similarity to adapt natural language processing techniques to examine the evolution and predictability of human lives based on detailed event sequences. We do this by drawing on a comprehensive registry dataset, which is available for Denmark across several years, and that includes information about life-events related to health, education, occupation, income, address and working hours, recorded with day-to-day resolution. We create embeddings of life-events in a single vector space, showing that this embedding space is robust and highly structured. Our models allow us to predict diverse outcomes ranging from early mortality to personality nuances, outperforming state-of-the-art models by a wide margin. Using methods for interpreting deep learning models, we probe the algorithm to understand the factors that enable our predictions. Our framework allows researchers to discover potential mechanisms that impact life outcomes as well as the associated possibilities for personalized interventions.
Bio: Sune is a Professor of Networks and Complexity Science at DTU Compute, Technical University of Denmark, and a Professor of Social Data Science at the Center for Social Data Science (SODAS), University of Copenhagen. His work focuses on quantitative understanding of social systems based on massive data sets. A physicist by training, he draws on approaches from the physics of complex systems, machine learning, and statistical analysis. He works on large-scale behavioral data, and while his primary focus is on modeling complex networks, his research has made substantial contributions on topics such as human mobility, sleep, academic performance, complex contagion, epidemic spreading, and behavior on Twitter.
Abstract: Circulation is the characteristic feature of successful currency systems, from community currencies to cryptocurrencies to national currencies. This talk will present a network approach to studying the circulation of money within such systems, touching on the data, the theory, and the tools. Modern payment infrastructure keeps digital transaction records that capture an ever greater share of circulation. A theory of walk processes on networks gives us a solid basis for representing such data, and lets us develop highly effective network analysis tools. We will discuss applied analyses of Sarafu, a digital community currency active in Kenya, and of a mobile money system elsewhere in Africa operating in the national currency. Several specific findings have concrete implications for humanitarian and development policy. More broadly, the ability to study the circulation of digital money in detail stands to accelerate our understanding of payment systems, the currency systems they comprise, the financial systems they underpin, and the economic systems they enable.
Bio: Dr. Carolina Mattsson is a network scientist developing analysis tools and modelling frameworks for studying the economy as a complex system. She is a Researcher at CENTAI Institute doing work on production networks, payment systems, and temporal networks. Aspects of her research are explicitly policy- or industry-facing; she has participated in projects with the Dutch Ministry of Economic Affairs, Statistics Netherlands, Telenor Research, IFC (World Bank), ING, and Intesa Sanpaolo. Carolina has a PhD in Network Science from the Network Science Institute at Northeastern University. During her PhD, she was supported by the NSF Graduate Research Fellowship Program and as a member of the Lazer Lab. Before joining CENTAI, Carolina was a postdoctoral researcher in the Computational Network Science group at Leiden University.
Abstract: State-of-the-Art Natural Language Processing (NLP) systems are trained on massive collections of data. Traditionally, NLP models are uni-modal: one form of data, e.g., textual data, is used for training. However, recent trends focus on multimodality, utilizing multiple forms of data in order to improve the system’s performance on classic tasks and to broaden the capabilities of AI systems. Image and code are the two common modalities used in training popular tools such as OpenAI’s GPT and Google's Gemini, among other LLMs. Language, however, is not merely a collection of stand-alone texts, nor texts merely grounded in image or aligned with code. Language is primarily used for communication between speakers in some social settings. The meaning (semantic, pragmatic) of a specific utterance is best understood by interlocutors that share some common ground and are aware of the context in which the communication takes place. In this talk I will demonstrate the benefits of the multi-modal framework through three unique tasks: conversational stance detection, the detection of hate mongers, and modeling distributed large-scale coordinated campaigns.
Bio: Dr. Oren Tsur is an Assistant Professor (Senior Lecturer) at the Department of Software and Information Systems Engineering at Ben Gurion University in Israel, where he heads the NLP and Social Dynamics Lab (NASLAB) and serves as the director of the newly founded Interdisciplinary Center for the Study of Digital Politics and Strategy (DPS@BGU). Oren’s work combines Machine Learning, Natural Language Processing (NLP), Social Dynamics, and Complex Networks. Specifically, Oren’s work ranges from sentiment analysis to modeling speakers’ language preferences, hate-speech detection, community dynamics, and adversarial influence campaigns. Oren serves as an editor and Senior Program Committee member in venues like ACL, EMNLP, WSDM and ICWSM, and as a reviewer for journals ranging from TACL to PNAS and Nature. Oren’s work was published in top NLP and Web Science venues. His work on sarcasm detection was listed in the “top 50 inventions of the year” in Time Magazine’s special technology issue. Academic homepage: https://www.naslab.ise.bgu.ac.il/orentsur
Series: Spring Complexity Series
Abstract: A lot of recent research pays attention to the psychological and cognitive factors that explain engagement with political information (including misleading content). This work offers important insights that help design interventions at the level of individuals. However, more systemic approaches are also needed to capture the aggregate characteristics of the information environment individuals navigate – and help create. In this talk, I will discuss recent research that uses networks to map information environments based on exposure behaviors. These networks help us identify pockets of problematic content and the types of audiences more likely to engage with that material. They also help us compare (and differentiate) modes of exposure and the different layers that structure the current media environment.
Abstract: This talk will cover recent work related to training and evaluating graph ML models on synthetic graphs. First, we discuss GraphWorld, a framework and package for generating a high-diversity set of medium-scale graphs for finding edge-cases of GNN performance. Second, we discuss new graph generative models that have been added to GraphWorld since its release. Finally, we discuss more recent work on generating large, individual synthetic graphs, and the challenges involved in training a GNN model on such graphs.
Bio: John Palowitch is a Research Scientist at Google Research, based in San Francisco, CA, working at the intersection of graph machine learning and LLMs.
Abstract: Behind the blur caused by the high-dimensional nonlinear dynamics and the intricate organization of complex systems hide essential mechanisms that explain the emergence of macroscopic phenomena. To uncover those mechanisms, it has been common practice for researchers to model complex systems using dynamics that depend upon low-rank matrices describing the networks of interactions (what we call the low-rank hypothesis). We present three indicators of the low-rank hypothesis and evidence of its ubiquity among random network models used in various fields of study, ranging from network science and machine learning to neuroscience. We then verify the hypothesis for real networks of various origins and use our observations to examine the repercussions of the low-rank hypothesis on nonlinear dynamics. In particular, we show that networks described by low (effective) rank matrices enable the dimension reduction of the nonlinear dynamics they support. Surprisingly, we find that higher-order interactions emerge naturally from an optimal dimension reduction, which demonstrates the profound interplay between the description dimension of a complex system and the possibility of having higher-order interactions.
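As a rough illustration of one such indicator (my own toy example using the stable rank, not necessarily one of the three indicators from the talk), a noisy low-rank interaction matrix has an effective rank far below its ambient dimension:

```python
import numpy as np

def stable_rank(A):
    """Stable rank ||A||_F^2 / ||A||_2^2, a common proxy for effective rank."""
    s = np.linalg.svd(A, compute_uv=False)
    return float((s ** 2).sum() / s[0] ** 2)

rng = np.random.default_rng(1)

# A 200 x 200 interaction matrix that is rank 2 plus small noise
U = rng.normal(size=(200, 2))
A = U @ U.T + 0.01 * rng.normal(size=(200, 200))

print(round(stable_rank(A), 2))  # far below the ambient dimension 200
```

Despite having full numerical rank because of the noise, the matrix behaves as an essentially two-dimensional object, which is the sense in which the dynamics it drives can be compressed.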
Bio: Vincent Thibeault is a Ph.D. candidate in Physics at Université Laval in Québec City, co-advised by Antoine Allard and Patrick Desrosiers. His main research activities concern dynamical processes on networks and the optimal compression of their mathematical descriptions, with applications ranging from computational neuroscience to epidemic spreading. His latest publication, featured in Nature Physics, delves into the fundamental question of the low-dimensional representation of complex systems. Additionally, Vincent’s interests and research in complexity science extend to other areas, including synchronization, spectral graph theory, adaptation, and information theory.
FEATURING SPECIAL GUEST
Vice Admiral Vivek H. Murthy, MD, MBA
U.S. Surgeon General, Department of Health and Human Services
PANELISTS
Dolores Albarracin (University of Pennsylvania)
Leticia Bode (Georgetown University)
Filippo Menczer (Indiana University)
Brendan Nyhan (Dartmouth College)
Katherine Ognyanova (Rutgers University)
David Rand (MIT)
FACILITATORS
Matthew Baum (Harvard University)
David Lazer (Northeastern University)
Patrick Sayers
Ayan Chatterjee
Samuel Westby
Ula Widocki
In the first chapter, I describe a theoretical and computational infrastructure that allows us to ask whether a given network captures the most informative scale to model the dynamics in the system. We see that many real-world networks (especially heterogeneous networks) exhibit an information holarchy whereby a coarse-grained, macroscale representation of the network has more effective information than the original microscale network. In the next chapter, I consider the challenging problem of comparing pairs of networks and quantifying their differences. These tools are broadly referred to as “graph distance” measures, and there are dozens used throughout Network Science. However, unlike in other domains of Network Science where rigorous benchmarks have been established to compare our surplus of tools, there is still no theoretically grounded benchmark for characterizing these tools. To address this, I propose that simple, well-understood ensembles of random networks are natural benchmarks for network comparison methods. In this chapter, I characterize over 20 different graph distance measures, and I show how this simple within-ensemble graph distance can lead to the development of new tools for studying complex networks. The final chapter is an example of exactly that: I show how the within-ensemble graph distance can be used to characterize and evaluate different techniques for reconstructing networks from time series data. Tying together the original theme of using the “right” network, this chapter addresses one of the most fundamental challenges in Network Science: how to study networks when the network structure is not known. Whether it’s reconstructing the network of neurons from time series of their activity, or identifying whether one stock’s price fluctuations cause changes in another’s, this problem is ubiquitous when studying complex systems; not only that, there are (again) dozens of techniques for transforming time series data into a network.
In this chapter, I measure the within-ensemble graph distance between pairs of networks that have been reconstructed from time series data using a given reconstruction technique. What I find is that different reconstruction techniques have characteristic distributions of distances and that certain techniques are either redundant or underspecified given other more comprehensive methods. Ultimately, the goal of this dissertation is to stress the importance of rigorous standards for the suite of tools we have in Network Science, which ultimately becomes an argument about how to make Network Science more useful as a science.
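A minimal version of the within-ensemble idea can be sketched as follows (G(n, p) and a spectral distance are chosen here purely for illustration; the dissertation characterizes many more distance measures and ensembles):

```python
import numpy as np

def gnp_adjacency(n, p, rng):
    """Sample a symmetric adjacency matrix from the G(n, p) ensemble."""
    A = np.triu((rng.random((n, n)) < p).astype(float), k=1)
    return A + A.T

def spectral_distance(A1, A2):
    """A simple graph distance: L2 norm between sorted adjacency spectra."""
    s1 = np.sort(np.linalg.eigvalsh(A1))
    s2 = np.sort(np.linalg.eigvalsh(A2))
    return float(np.linalg.norm(s1 - s2))

rng = np.random.default_rng(2)
n, p = 60, 0.1

# Within-ensemble distances: compare independent draws from the same ensemble
dists = [spectral_distance(gnp_adjacency(n, p, rng), gnp_adjacency(n, p, rng))
         for _ in range(30)]

# The mean and spread of this distribution characterize the distance measure
print(round(float(np.mean(dists)), 2), round(float(np.std(dists)), 2))
```

Because the ensemble is fully understood, the resulting distribution of distances serves as a baseline: a distance measure whose within-ensemble distribution collapses or diverges oddly with n or p is flagged as redundant or underspecified.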
In this talk, I will show how to design deep learning models to learn from large-scale spatiotemporal data, especially for dealing with non-Euclidean geometry, long-term dependencies and incorporating logical/physical constraints. I will showcase the application of these models to a variety of problems in transportation, sports, circuit design, and aerospace control. I will also discuss the opportunities and challenges of applying deep learning to large-scale spatiotemporal data.
In experimental studies of the adaptive immune system, we observed a scale-free network governing the repertoire of memory T-cells (Naumov et al., 2003). At the molecular level, we observe that a memory immune response to influenza virus becomes diverse upon repeated exposures to the virus, which can be modeled as a fractal self-similar system. A theoretical explanation of these experimental findings has been given by the small-world construction (Ruskin and Burns, 2006) as a special case of the scale-free network (Albert and Barabási, 2002). We then simulated the fractal behavior mimicking immune memory - its generation, maintenance and senescence (Naumova et al., 2008) - and experimentally illustrated the general stability of the power-law structures and age-related changes. Our recent theoretical work confirms the assumption that multiple expansion-contraction cycles define the robustness of the immune response and correspond to memory formation (Saito and Narikiyo, 2011). Saito and Narikiyo proposed the dynamical network of the adaptive immune system as a self-organized critical state in which avalanche feedback reinforcement may reduce immunosenescence.
At the population level, we also observed the evidence of exposure to influenza as a marker of “immunological age.” In the cohort of healthy donors, each encounter with an infectious agent was unique for every person. Yet, the commonality in responses formed “immunological kinship” among all affected individuals, manifested by a preserved T-cell clonal pool. The diverse responses to flu and changes in diversity allow us to make an inference to “immunological kinship” and “immunological age.” Our experimental data indicate that at a certain point the continuing exposures to influenza begin to decrease the diversity of immune response. These observations lead us to explore theoretical conditions governing the “stable” and “volatile” components of the T-cell repertoires via dynamic neural networks. Such separation allowed us to detect a condition indicative of acceleration of immune aging. We derived the initial network parameters based on a specially designed anchored power-law regression fit of experimental data from middle-aged and older donors over time and illustrated age acceleration and immunosenescence in humans.
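The anchored regression itself is specific to the study's data, but the basic step, estimating a power-law exponent by least squares on log-log axes, can be sketched with synthetic clone-size data (all values below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic clone sizes following y ~ c * x^(-alpha) with multiplicative noise
x = np.arange(1.0, 101.0)
alpha_true = 1.8
y = 50.0 * x ** (-alpha_true) * np.exp(0.05 * rng.normal(size=x.size))

# Ordinary least squares on log-log axes recovers the exponent
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
alpha_hat = -slope
print(round(alpha_hat, 2))  # close to alpha_true
```

In the study's setting, shifts in the fitted exponent over repeated measurements of the same donors would be the signal of accelerated immune aging.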
In my talk, I describe some recent works in which we have leveraged data from public and commercial entities in order (i) to infer how vital and liveable a city is, (ii) to find the urban conditions (e.g., mixed land use, mobility routines, safety perception levels, etc.) that magnify and influence urban life, and (iii) to study their relationship with societal outcomes such as criminality and urban segregation. Our results open the door to a new research framework for studying and understanding cities, and societies, by means of computational tools (i.e., machine learning approaches) and novel sources of data able to describe human life with unprecedented breadth, scale and depth.
In this talk I will present a framework we have developed that integrates novel modeling techniques with nontraditional data sources to identify the source of emerging outbreaks of foodborne disease. Approaching this problem requires (i) modeling the network structure of the aggregated food supply system and (ii) developing network-theoretic methods to solve the food vector and contamination location source identification problems. I will discuss our approach to both parts of this problem, experiences implementing these methods at Germany's federal-level food regulatory agency, and a developing project to extend this work to the US context.
First, I will introduce our approach to model the network structure of the aggregated food supply system utilizing publicly available statistical data and methods from transport demand modeling [1]. Then I will review our network epidemiological approach to identify the food and location source of an outbreak given the food supply network model and reported locations of illness [2,3]. To solve the source location problem we formulate a probabilistic model of the contamination diffusion process and derive the maximum likelihood estimator for the source location. We use the location source estimator as the basis of an information theoretic approach to identify the food vector source carrying the contamination. A statistical test is developed to identify the food item network that best fits the observed distribution of illness data.
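As a rough illustration of source location estimation (a simplified stand-in, not the probabilistic estimator of [2,3]), one can score each candidate node by its total hop distance to the reported illness locations and pick the minimizer; the toy supply network below is hypothetical:

```python
from collections import deque

def bfs_distances(graph, source):
    """Hop distances from `source` to all nodes of an undirected graph
    given as an adjacency dict {node: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def estimate_source(graph, case_nodes):
    """Return the candidate node minimizing total hop distance to the
    reported illness locations -- a crude proxy for the maximum
    likelihood source when contamination spreads outward uniformly."""
    best, best_cost = None, float("inf")
    for candidate in graph:
        dist = bfs_distances(graph, candidate)
        cost = sum(dist[c] for c in case_nodes)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best

# Hypothetical network: hub "A" feeding three short distribution chains;
# illnesses reported at the three leaves point back to the hub.
graph = {
    "A": ["B", "C", "D"],
    "B": ["A", "E"], "C": ["A", "F"], "D": ["A", "G"],
    "E": ["B"], "F": ["C"], "G": ["D"],
}
print(estimate_source(graph, ["E", "F", "G"]))  # A
```

The actual estimator additionally models travel-time variability along supply chains, which matters when illness reports arrive asynchronously.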
Case studies of several recent outbreaks in Germany suggest that the application of the combined network models and inference methods could have substantial benefits for investigators during the onset of outbreaks of foodborne disease. Beyond foodborne disease, we are applying these methods to identify the source of spread in network-based diffusion processes more generally, including disease spread through global transport networks and bacterial contamination spread through water distribution networks.
Embedding models are used in production for Google Search, in the Discover Weekly recommendation system at Spotify, and for learning representations of biological systems like genes and proteins. In this work, we develop an embedding model for foods based on patterns in a large recipe dataset. A recommendation system for food is built based on the embedding model, and we show that our model learns concepts such as which foods are complementary or which foods can be substituted for each other in recipes. The code and data are open source and readily extendable to new kinds of data.
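As a minimal sketch of the idea (not the actual embedding model described here, which is trained on a large recipe dataset), ingredients can be represented by their positive pointwise mutual information (PPMI) profiles over co-occurring ingredients, so that ingredients used in similar recipes get similar vectors; the tiny recipe corpus below is made up:

```python
import math
from collections import Counter
from itertools import combinations

# Made-up toy corpus standing in for a large recipe dataset.
recipes = [
    ["flour", "butter", "sugar", "egg"],
    ["flour", "butter", "sugar", "vanilla"],
    ["flour", "egg", "milk"],
    ["basil", "tomato", "garlic", "olive_oil"],
    ["tomato", "garlic", "onion", "olive_oil"],
    ["basil", "tomato", "olive_oil"],
]

# Count ingredient occurrences and within-recipe co-occurrences.
pair_counts, item_counts = Counter(), Counter()
for recipe in recipes:
    item_counts.update(recipe)
    for a, b in combinations(sorted(set(recipe)), 2):
        pair_counts[(a, b)] += 1
total_pairs = sum(pair_counts.values())
total_items = sum(item_counts.values())

def ppmi(a, b):
    """Positive pointwise mutual information between two ingredients."""
    key = (min(a, b), max(a, b))
    if pair_counts[key] == 0:
        return 0.0
    p_ab = pair_counts[key] / total_pairs
    return max(0.0, math.log(p_ab * total_items ** 2
                             / (item_counts[a] * item_counts[b] * 1.0)))

def vector(a):
    """Embed ingredient `a` as its PPMI profile over the vocabulary."""
    return [ppmi(a, b) for b in sorted(item_counts)]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u) * sum(y * y for y in v))
    return dot / norm if norm else 0.0

# Same-cuisine pairs end up closer than cross-cuisine pairs.
print(cosine(vector("basil"), vector("garlic")) >
      cosine(vector("basil"), vector("sugar")))  # True
```

Neural embedding models learn denser vectors, but this co-occurrence view captures the same intuition behind complement and substitute detection.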
Social relationships characterize the interactions that occur within social species and may have an important impact on collective animal motion. Here, we consider some variations of the standard Vicsek model for collective motion to incorporate social influence. The main assumption of the Vicsek and other similar models of collective motion is that particles tend to orient their velocity parallel to the average velocity in a local neighborhood, independently of their identity, leaving aside the fact that real interactions between moving animals can be more intricate. By incorporating interactions mediated by an empirically motivated scale-free topology that represents a heterogeneous pattern of social contacts, we observe that the degree of order of the model is strongly affected by network heterogeneity: more heterogeneous networks show a more resilient ordered state, while less heterogeneous networks exhibit a more fragile ordered state that can be destroyed by sufficient external noise.
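A minimal sketch of this network variant of the Vicsek model (the contact network, noise amplitude, and parameter values below are illustrative assumptions, not those of the study): each agent adopts the mean heading of its social contacts plus angular noise.

```python
import math
import random

def vicsek_step(headings, neighbors, eta, rng):
    """One update of a network Vicsek model: each agent adopts the mean
    heading of its social contacts (a fixed contact network standing in
    for the spatial neighborhood), plus uniform noise of amplitude eta."""
    new = []
    for i in range(len(headings)):
        group = neighbors[i] + [i]  # contacts plus the agent itself
        sx = sum(math.cos(headings[j]) for j in group)
        sy = sum(math.sin(headings[j]) for j in group)
        new.append(math.atan2(sy, sx) + rng.uniform(-eta / 2, eta / 2))
    return new

def order_parameter(headings):
    """Magnitude of the mean heading vector: 1 = fully ordered flock,
    0 = fully disordered."""
    n = len(headings)
    sx = sum(math.cos(t) for t in headings) / n
    sy = sum(math.sin(t) for t in headings) / n
    return math.hypot(sx, sy)

# Toy run on a star-shaped (maximally heterogeneous) contact network
# with weak noise: the flock orders quickly around the hub.
rng = random.Random(0)
neighbors = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
headings = [rng.uniform(-math.pi, math.pi) for _ in range(5)]
for _ in range(50):
    headings = vicsek_step(headings, neighbors, eta=0.1, rng=rng)
print(order_parameter(headings) > 0.9)  # True
```

Sweeping `eta` upward for star versus ring topologies is the simplest way to see the resilience difference described above.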
Another important aspect of collective animal motion is the existence of behavioral changes at the individual level, which may be transmitted to the group, triggering intermittent collective rearrangements or even phase transitions at the macroscopic level. We examine avalanching behavior in the collective motion of flocks in which a single individual exerts a long-range orientational contagion effect on the rest of the individuals. We observe that the response of the flock to changes in the direction of motion of such individuals shows intermittent, avalanche-like behavior, characterized by sudden reorientations of the trajectories of groups of individuals. We show that the distributions of avalanche sizes and durations show scale-free signatures, in analogy with self-organized critical processes. The results obtained appear to be in fairly good agreement with recent experimental results characterizing collective evasion in schooling fish. Yet more empirical data are needed to obtain a better understanding of the patterns of collective rearrangement in other flocking systems, where individual differences and/or social interactions may have an important effect.
In this talk, I focus on representation of semantic knowledge -- word meanings and their relations -- which is an important aspect of child language learning and AI systems: it impacts how word meanings are stored in, searched for, and retrieved from memory. First, I talk about how humans learn and represent semantic knowledge. I show that, using the evolving knowledge of word relations and their contexts, we can grow a network that exhibits the properties of adult semantic knowledge. Moreover, this can be achieved using limited computation. Next, I explain how investigating human semantic processing helps us model semantic representations more accurately. I show that recent neural models of semantics, despite being trained on huge amounts of data, fail to capture important aspects of human similarity judgements. I also show that a probabilistic topic model does not have these problems, suggesting that exploring different representations may be necessary to capture different aspects of human semantic processing.
In this presentation, I will discuss two promising approaches that synthesize the macroscopic organization of real complex networks into a set of local properties, which in turn naturally define random graph ensembles reproducing these macroscopic features from local connection rules only. I will then discuss how the various tools developed to unveil this effective structure of networks can be used to shed light on new phenomena in epidemiology and neuroscience. This will be illustrated via ongoing projects dealing with the current threat of a Zika epidemic and the organization of the connectome across species.
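A textbook example of an ensemble defined by local properties is the configuration model, which fixes each node's degree and randomizes everything else; the sketch below (a generic illustration, not the specific ensembles discussed in the talk) pairs up degree "stubs" at random:

```python
import random

def configuration_model(degrees, rng):
    """Sample a (multi)graph with the given degree sequence: create
    degrees[i] stubs for node i, shuffle them, and pair them up.
    Self-loops and multi-edges can occur but become rare for large
    sparse graphs."""
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    if len(stubs) % 2:
        raise ValueError("degree sequence must have even sum")
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

# Whatever the shuffle produces, the sampled graph realizes exactly the
# prescribed local property: the degree sequence.
edges = configuration_model([3, 2, 2, 2, 1], random.Random(1))
realized = [0] * 5
for u, v in edges:
    realized[u] += 1
    realized[v] += 1
print(realized)  # [3, 2, 2, 2, 1]
```

Any macroscopic feature shared by all graphs in such an ensemble is then explained by the local rule alone, which is the logic behind the approaches above.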
This is joint work with Samuel F. Way, Allison C. Morgan, Roberta Sinatra and Daniel B. Larremore.
This is joint work with Mark Newman.
A challenging and increasingly important type of data is networks of entities and their relationships. Networks are widely used across diverse disciplines to reason about complex behavior. These analyses involve understanding relationships as well as associated attributes, statistics, or groupings. The omnipresent node-link visualization excels at showing topology and features simultaneously, but many such visualizations are difficult to extract meaning from because of poor layout or the shoehorning of inherent complexity into limited space. The first part of my talk will detail techniques for measuring the readability of node-link visualizations and strategies to help users create more effective and understandable visualizations.
Moreover, analyses of complex data often require several sessions, and when returning later it can be difficult to recall the steps in one's workflow. Data science in many domains is also highly collaborative: multiple analysts may be working alongside stakeholders with varying expertise and time constraints. The second part of my talk addresses these needs, introducing visualization strategies that help make analysis workflows repeatable, free of errors, understandable, and easily shareable.
During this talk, I will present a novel mathematical framework for the modeling of highly time-varying networks and processes evolving on their fabric. In particular, I will focus on epidemic spreading, random walks, and social contagion processes on temporal networks.
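As a small illustration of why temporal networks require dedicated machinery (a deterministic toy, not the framework of the talk; real random-walk models choose among available contacts probabilistically), note that paths on a temporal network must respect the time ordering of contact events:

```python
def time_respecting_walk(events, start):
    """Follow contacts in time order: whenever a contact (t, u, v)
    touches the walker's current node, the walker crosses it. Later
    contacts cannot be used before earlier ones."""
    node = start
    path = [node]
    for t, u, v in sorted(events):
        if node == u:
            node = v
        elif node == v:
            node = u
        else:
            continue
        path.append(node)
    return path

# With contacts a-b (t=1) then b-c (t=2), the walker reaches c...
print(time_respecting_walk([(1, "a", "b"), (2, "b", "c")], "a"))
# ...but with the same contacts in reverse time order it is stuck at b,
# even though the aggregated static network is identical.
print(time_respecting_walk([(1, "b", "c"), (2, "a", "b")], "a"))
```

This ordering effect is exactly what static-network approximations of spreading and random-walk processes miss.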