2023 IDEAL Get Ready for Research Workshop

Friday – Sunday, June 2-4, 2023

Synopsis

IDEAL will be hosting a 3-day “Get Ready for Research” workshop for undergraduate students from around the country. This workshop, aimed at 1st and 2nd year undergraduates, will introduce students to the practicalities of research, current research areas in mathematics, computer science, electrical engineering, and statistics, and help prepare them to apply for and thrive in REU programs in future years.

At the workshop participants will learn the following:

- What is “Data Science” and how is it related to what you learn in math, CS, statistics, or EE courses?
- What does current research in Data Science look like?
- How do researchers choose problems to work on?
- What does the daily schedule of a researcher look like?
- How do undergraduates get involved in research?
- What is a “Research Experience for Undergraduates” (REU), how can one apply, and how can one thrive in an REU?
- What kind of research do PhD students in these fields do?

There will be introductory lectures on several topics in data science, following the themes of the IDEAL Institute for Data, Econometrics, Algorithms, and Learning. There will be roundtable discussions with faculty, postdocs, and graduate students, and there will be informal discussions and breaks in which participants can meet their peers and researchers in these fields.

No particular background is assumed, beyond having taken 1st and 2nd year classes in one of the fields represented (math, CS, stats, EE). The program is free to participate in and students’ travel expenses will be covered.

Saturday Panel

Logistics

Date: Friday – Sunday, June 2-4, 2023
Location: Chicago, IL at the University of Illinois Chicago
The workshop is open to undergraduates at universities and colleges in the United States. The workshop is designed to be most useful to students finishing their first or second year of college, and preference will be given to those students, but all undergraduates are welcome to apply.
Registration: To submit an application, please fill out this form: https://forms.gle/yXb3Dggvx2Tw7B9s8
For full consideration, please submit an application by February 10, 2023. Applications received after that date will be considered as space permits.
Participants’ travel and accommodation costs will be covered, either by reimbursement or by direct booking of hotel rooms. Please indicate on the application form if you plan to travel to the workshop from outside the Chicago area.
Contact: please email IDEALGetReady@gmail.com with any questions about the workshop.

Schedule

Friday

4:30pm Registration, welcoming remarks

5:00pm What is Research? (Samir Khuller)

5:30pm Research Topic 1: Community Detection on Geometric Random Graphs (Xiaochun Niu)

5:45pm Research Topic 2: Approximating the Prime-Counting Function (Tian Wang)

6:00pm Research Topic 3: Causal Inference for Policy Evaluation in Network Data (Shishir Adhikari)

6:15pm Research Highlight: Strategic Classification (Saba Ahmadi)

6:30pm Pizza / group formation

Saturday

9:00am Breakfast

9:30am Panel: Experience in Research (Shishir Adhikari, Amil Dravid, Duan Tu, Xiaochun Niu)

10:15am Small groups

11:15am Research Overview: Learning in Networks (Miklos Racz)

12:00pm Lunch

12:45pm Research Overview: Learning Collective Behavior in Networks (Ming Zhong)

1:30pm Small groups

3:00pm Break

3:15pm Research Overview: Are Auctions Good? (Jason Hartline)

4:00pm Research Highlight: Scalable Graph Algorithms: From Theory to Practice (Quanquan Liu)

4:15pm Break

4:30pm Research Overview: Online Algorithms for Data Analysis (Aditya Bhaskara)

5:15pm Dinner on your own

Sunday

9:00am Breakfast

9:30am Small groups

11:00am How to Find and Apply for Research Opportunities (Nick Christo)

11:30am Panel: Past REU Students and Where They Are Now (Jingling Li, Frederic Koehler, Riley Murray, Pascal Sturmfels, Yifan Wu)

12:15pm Lunch

1:00pm Presentations from small groups

2:00pm Departure

Sunday Panel

Talk Information

Research Highlights:

Saba Ahmadi: Strategic Classification

Abstract: In this talk, I will discuss learning in the presence of strategic behavior. We consider an online linear classification problem where agents arrive one by one and they wish to be classified as positive. They observe the current prediction rule and manipulate their features to get classified as positive if they can do so for a cost less than their value for being classified as positive. We show an algorithm that makes a bounded number of mistakes in the presence of strategic agents for both ℓ2 and weighted ℓ1 manipulation costs. Based on joint work with Avrim Blum, Hedyeh Beyhaghi, and Keziah Naggita.

Bio: Saba Ahmadi is a postdoc at TTIC hosted by Prof. Avrim Blum. She received her Ph.D. from the University of Maryland, College Park, where she was advised by Prof. Samir Khuller. She is broadly interested in the foundations of responsible computing, machine learning, economics, and theoretical computer science.

Quanquan Liu: Scalable Graph Algorithms: From Theory to Practice

Abstract: Graph algorithms are ubiquitous in today’s world where graph analytics are performed over massive datasets containing potentially sensitive information. Modern graphs present many new challenges not considered by classic static, sequential computation models. First, graphs have up to trillions of edges and are several orders of magnitude larger than what traditional sequential algorithms can handle. In addition to scale, modern graphs are also dynamically evolving with up to millions of changes per second. Second, data leaks and commercial data trading threaten to expose the large volume of sensitive private information contained in these graphs. Third, the monetary and resource incentives associated with large distributed graphs (e.g., for cryptocurrency) make them vulnerable to malicious adversaries. Thus, modern graph algorithms must achieve several simultaneous goals: efficiency, scalability, privacy, and robustness against adversaries. I’ll give an overview of my research which deals with each of these topics.

Bio: Quanquan C. Liu is a postdoctoral scholar at Northwestern University advised by Samir Khuller. She completed her PhD in Computer Science at MIT, where she was advised by Erik D. Demaine and Julian Shun. Before that, she obtained her dual bachelor’s degree in computer science and math, also at MIT. She has worked on a number of problems in algorithms and the intersection between theory and practice. Her most recent work focuses on parallel dynamic and static graph algorithms as well as differentially private graph algorithms. She has earned the Best Paper Award at SPAA 2022 and an NSF Graduate Research Fellowship, and participated in the 2021 EECS Rising Stars workshop.

Topic Overviews:

Miklos Racz: Learning in Networks

Abstract: Networks play a central role in our lives, in society, and in the sciences. This talk will highlight some of the main areas of research in understanding the structure and dynamics of networks. It will touch upon recovering communities in networks, the small world nature of networks and their navigability, and dynamics such as viral cascades.

Bio: Miklos Z. Racz is an assistant professor at Northwestern University, jointly in the Department of Statistics and Data Science and the Department of Computer Science. Before joining Northwestern, he received his PhD in Statistics from UC Berkeley, was a postdoc in the Theory Group at Microsoft Research, Redmond, and was then an assistant professor at Princeton University. Miki’s research lies at the interface of probability, statistics, computer science, and information theory. Miki’s research and teaching have been recognized by Princeton’s Howard B. Wentz, Jr. Junior Faculty Award, a Princeton SEAS Innovation Award, and an Excellence in Teaching Award.

Ming Zhong: Learning Collective Behavior in Networks

Abstract: Collective behaviors, such as clustering, flocking, swarming, and synchronization, appear in many branches of science. The study of collective behaviors focuses on the emergence of order from complex interacting agent systems. It is a challenging task to provide an accurate and concise mathematical understanding. We offer a family of machine learning methods to understand such behaviors from observation data. Our methods can be used to improve the modeling of collective behaviors. We will discuss the existing aspects of how to learn collective behaviors from data in this talk.

Bio: Dr. Ming Zhong is currently an assistant professor in applied mathematics at Illinois Institute of Technology. Before his appointment at Illinois Tech, he was hired as a data scientist to build the scientific machine learning program at the Texas A&M Institute of Data Science. He did his postdoctoral research with Mauro Maggioni at Johns Hopkins on learning collective behaviors and obtained his Ph.D. in applied mathematics from the University of Maryland, under the guidance of Eitan Tadmor.

Jason Hartline: Are Auctions Good?

Abstract: In this talk I will discuss the game theoretic analysis of auctions and the central question of whether or not auctions achieve good outcomes when bidders are strategic. I will introduce two analyses that allow the welfare of auctions in equilibrium to be analyzed without painstakingly solving for the equilibrium of the auction. The two analyses that are sufficient are (1) whether or not there is efficiency loss when bidders solve their best response problem and (2) whether or not there is efficiency loss in how the auction rules convert competition between bidders into auction revenue. This approach of analyzing welfare without solving for equilibrium is especially important in complex auction environments like those of Internet advertising.

Bio: Prof. Hartline’s research introduces design and analysis methodologies from computer science to understand and improve outcomes of economic and legal systems. Optimal behavior and outcomes in complex environments are complex and, therefore, should not be expected; instead, the theory of approximation can show that simple and natural behaviors are approximately optimal in complex environments. This approach is applied to auction theory and mechanism design in his graduate textbook, Mechanism Design and Approximation, which is in preparation.

Prof. Hartline received his Ph.D. in 2003 from the University of Washington under the supervision of Anna Karlin. He was a postdoctoral fellow at Carnegie Mellon University under the supervision of Avrim Blum and subsequently a researcher at Microsoft Research in Silicon Valley. He joined Northwestern University in 2008, where he is a professor of computer science. He was on sabbatical at Harvard University in the Economics Department during the 2014 calendar year and visited Microsoft Research, New England in the spring of 2015.

Prof. Hartline is the director of Northwestern’s Online Markets Lab, was a founding codirector of the Institute for Data, Econometrics, Algorithms, and Learning from 2019 to 2022, and is a cofounder of the virtual conference organizing platform Virtual Chair.

Aditya Bhaskara: Online Algorithms for Data Analysis

Abstract: In many applications of learning and data analysis, inputs (data points, users, etc.) arrive online, and an algorithm must make decisions iteratively, whilst competing with an optimal offline solution — one that can be computed in hindsight after seeing the full input.

I will discuss an online variant of k-means clustering, a classic problem in data analysis. Here, points arrive one after another, and the algorithm must assign a point to a cluster as soon as it arrives (and this decision is irrevocable). The goal is to ensure that the resulting solution is comparable to the “optimal in hindsight” clustering. I will show how to do this via a “novelty sampling” approach.

Next, I will talk about recent “beyond worst case” models for online algorithms. I will specifically talk about a setting where we assume that the input sequence has some “predictability”. In some classic online decision-making settings, such an assumption can yield surprisingly better guarantees in terms of competing against the best offline solution.

Bio: Aditya Bhaskara is an Associate Professor of Computer Science at the University of Utah. His research interests lie broadly in theoretical computer science and machine learning. His recent work has focused on designing learning algorithms under resource-constrained models such as streaming and online arrivals, and studying new beyond-worst-case analytical models. He received a Ph.D. in Computer Science from Princeton University and a Bachelor of Technology in Computer Science from the Indian Institute of Technology Bombay. He is a recipient of the Google Faculty Research Award and the NSF Early Career Research Award.