Z. He, Y. Song, S. Zhou, and Z. Cai. Interaction of Thoughts: Towards Mediating Task Assignment in Human-AI Cooperation with a Capability-Aware Shared Mental Model. The ACM CHI Conference on Human Factors in Computing Systems (CHI). 2023.
The existing work on task assignment of human-AI cooperation did not consider the diferences between individual team members regarding their capabilities, leading to sub-optimal task completion results. In this work, we propose a capability-aware shared mental model (CASMM) with the components of task grouping and negotiation, which utilize tuples to break down tasks into sets of scenarios relating to difculties and then dynamically merge the task grouping ideas raised by human and AI through negotiation. We implement a prototype system and a 3-phase user study for the proof of concept via an image labeling task. The result shows building CASMM boosts the accuracy and time efciency signifcantly through forming the task assignment close to real capabilities within few iterations. It helps users better understand the capability of AI and themselves. Our method has the potential to generalize to other scenarios such as medical diagnoses and automatic driving in facilitating better human-AI cooperation.
A. Bhat, A. Coursey, G. Hu, S. Li, N. Nahar, S. Zhou, C. Kästner, and J. Guo. Aspirations and Practice of ML Model Documentation: Moving the Needle with Nudging and Traceability. The ACM CHI Conference on Human Factors in Computing Systems (CHI). 2023.
The documentation practice for machine-learned (ML) models often falls short of established practices for traditional software, which impedes model accountability and inadvertently abets inappropriate or misuse of models. Recently, model cards, a proposal for model documentation, have attracted notable attention, but their impact on the actual practice is unclear. In this work, we systematically study the model documentation in the field and investigate how to encourage more responsible and accountable documentation practice. Our analysis of publicly available model cards reveals a substantial gap between the proposal and the practice. We then design a tool named DocML aiming to (1) nudge the data scientists to comply with the model cards proposal during the model development, especially the sections related to ethics, and (2) assess and manage the documentation quality. A lab study reveals the benefit of our tool towards long-term documentation quality and accountability.
K. Cheng, S. Zhou. , and A. Olechowski. In the age of collaboration, the Computer-Aided Design ecosystem is behind: Evidence from an interview study of distributed CAD practice. The 26th ACM Conference On Computer-Supported Cooperative Work And Social Computing (CSCW). 2023.
Computer-aided design (CAD) has become indispensable to increasingly collaborative hardware design processes. Despite the long-standing and growing need for collaboration with CAD models and tools, anecdotal reports and ongoing researcher efforts point to a complex and unresolved set of challenges faced by designers when working with distributed CAD. We aim to close this academic-practitioner knowledge gap through the first systematic study of professional user-driven CAD collaboration challenges. In this work, we conduct semi-structured interviews with 20 CAD professionals of diverse industries, roles, and experience levels to understand their collaborative workflows with distributed CAD tools. In total, we identify 14 challenges related to collaborative design, communication, data management, and permissioning that are currently impeding effective collaboration in professional CAD teams. Our systematic classification of CAD collaboration challenges presents a guide for pressing areas of future work, highlighting important implications for CAD researchers, practitioners, and tool builders to target new advancement in CAD infrastructure, management choices, and modelling best practices. With the insights gained from this work, we hope to ultimately improve collaboration efficiency, quality, and innovation for future product design teams.
N. Nahar, H. Zhang, G. Lewis, S. Zhou, and C. Kästner. A Meta-Summary of Challenges in Building Products with ML Components – Collecting Experiences from 4758+ Practitioners. In Proceedings of the International Conference on AI Engineering - Software Engineering for AI (CAIN). 2023.
Incorporating machine learning (ML) components into software products raises new software-engineering challenges and exacerbates existing ones. Many researchers have invested significant effort in understanding the challenges of industry practitioners working on building products with ML components, through interviews and surveys with practitioners. With the intention to aggregate and present their collective findings, we conduct a meta-summary study: We collect 50 relevant papers that together interacted with over 4758 practitioners using guidelines for systematic literature reviews. We then collected, grouped, and organized the over 500 mentions of challenges within those papers. We highlight the most commonly reported challenges and hope this meta-summary will be a useful resource for the research community to prioritize research and education in this field.
Mahsa Hadian, Scott Brisson, Bram Adams, Soude Ghari, Ehsan Noei, Marios Fokaefs, Kelly Lyons, and S. Zhou. Exploring trends and practices of forks in open-source software repositories.. In Proceedings of the 32nd Annual International Conference on Computer Science and Software Engineering (CASCON '22). IBM Corp., USA, 120–129.
Forking a software repository is a popular and recommended practice among developers. A fork is a copy of the original repository that can evolve independently from the parent repository, allowing developers to experiment with a code base or test new features without the danger of affecting the original project. A fork can result in changes that are pushed back to the original project or even evolve into an independent project. Some projects tend to be forked extensively to the point where their forks are also forked and form families of projects. In this work, we explore the motivation, the practices and the culture of forking open-source software repositories. In particular, we study how forks evolve compared to the parent repository, how they are related to pull requests, how they contribute back to the parent, and how dependencies, in terms of libraries or external modules defined in a build script, are shared or differ within project families. Finally, we relate our findings with how communication and collaboration occurs within software families.
ICSME 2022 - NIER
Y. Jiang, C. Kästner, and S. Zhou. Elevating Jupyter Notebook Maintenance Tooling by Identifying and Extracting Notebook Structures. International Conference on Software Maintenance and Evolution (ICSME 2022) - New Ideas and Emerging Results Track (NIER).
Data analysis is an exploratory, interactive, and often collaborative process. Computational notebooks have become a popular tool to support this process, among others because of their ability to interleave code, narrative text, and results. However, notebooks in practice are often criticized as hard to maintain and being of low code quality, including problems such as unused or duplicated code and out-of-order code execution. Data scientists can benefit from better tool support when maintaining and evolving notebooks. We argue that central to such tool support is identifying the structure of notebooks. We present a lightweight and accurate approach to extract notebook structure and outline several ways such structure can be used to improve maintenance tooling for notebooks, including navigation and finding alternatives.
S. Rong, W. Wang, U. Mannan, E. Almeida, S. Zhou, and I. Ahmed. An Empirical Study of Emoji Use in Software Development Communication. Information and Software Technology (IST 2022).
Context: Similar to social media platforms, people use emojis in software development related communication to enrich the context and convey additional emotion. With the increasing emoji use in software development-related communication, it has become important to understand why software developers are using emojis and their impact. Objective: Gaining a deeper understanding is essential because the intention of emoji usage might be affected by the demographics and experience of developers; also, frequency and the distribution of emoji usage might change depending on the activity, stage of the development, and nature of the conversation, etc. Methods: We present a large-scale empirical study on the intention of emoji usage conducted on 2,712 Open Source Software (OSS) projects. We build a machine learning model to automate classifying the intentions behind emoji usage in 39,980 posts. We also surveyed 60 open-source software developers from 17 countries to understand developers’ perceptions of why and when emojis are used. Results: Our results show that we can classify the intention of emoji usage with high accuracy (AUC of 0.97). In addition, the results indicate that developers use emoji for varying intentions, and emoji usage intention changes throughout a conversation. Conclusion: Our study opens a new avenue in Software Engineering research related to automatically identifying the intention of the emoji use that can help improve the communication efficiency and help project maintainers monitor and ensure the quality of communication. Another thread of future research could look into what intentions of emoji usage or what kind of emojis are more likely to attract users and how that is associated with emoji usage diffusion in different levels (threads, projects, etc.)
N. Nahar, S. Zhou, G. Lewis, and C. Kästner. Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process 44th International Conference on Software Engineering (ICSE 2022).
The introduction of machine learning (ML) components in software projects has created the need for software engineers to collaborate with data scientists and other specialists. While collaboration can always be challenging, ML introduces additional challenges with its exploratory model development process, additional skills and knowledge needed, difficulties testing ML systems, need for continuous evolution and monitoring, and non-traditional quality requirements such as fairness and explainability. Through interviews with 45 practitioners from 28 organizations, we identified key collaboration challenges that teams face when building and deploying ML systems into production. We report on common collaboration points in the development of production ML systems for requirements, data, and integration, as well as corresponding team patterns and challenges. We find that most of these challenges center around communication, documentation, engineering, and process and collect recommendations to address these challenges.
H. Dong, S. Zhou, J. Guo, and C. Kästner. Splitting, Renaming, Removing: A Study of Common Cleaning Activities in Jupyter Notebooks. The 8th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE), 2021.
Data scientists commonly use computational notebooks because they provide a good environment for testing multiple models. However, once the scientist completes the code and finds the ideal model, he or she will have to dedicate time to clean up the code in order for others to easily understand it. In this paper, we perform a qualitative study on how scientists clean their code in hopes of being able to suggest a tool to automate this process. Our end goal is for tool builders to address possible gaps and provide additional aid to data scientists, who then can focus more on their actual work rather than the routine and tedious cleaning work. By sampling notebooks from GitHub and analyzing changes between subsequent commits, we identified common cleaning activities, such as changes to markdown (e.g., adding headers sections or descriptions) or comments (both deleting dead code and adding descriptions) as well as reordering cells. We also find that common cleaning activities differ depending on the intended purpose of the notebook. Our results provide a valuable foundation for tool builders and notebook users, as many identified cleaning activities could benefit from codification of best practices and dedicated tool support, possibly tailored depending on intended use.
K. Constantino, S. Zhou, M. Souza, E. Figueiredo, and C. Kästner. Perceptions of Open-Source Software Developers on Collaborations: An Interview and Survey Study. Journal of Software: Evolution and Process (JSME), 2021.
With the emergence of social coding platforms, collaboration has become a key and dynamic aspect to the success of software projects. In such platforms, developers have to collaborate and deal with issues of collaboration in open-source software development. Although collaboration is challenging, collaborative development produces better software systems than any developer could produce alone. Several approaches have investigated collaboration challenges, for instance, by proposing or evaluating models and tools to support collaborative work. Despite the undeniable importance of the existing efforts in this direction, there are few works on collaboration from perspectives of developers. In this work, we aim to investigate the perceptions of open-source software developers on collaborations, such as motivations, techniques, and tools to support global, productive, and collaborative development. Following an ad hoc literature review, an exploratory interview study with 12 opensource software developers from GITHUB, our novel approach for this problem also relies on an extensive survey with 121 developers to confirm or refute the interview results. We found different collaborative contributions, such as managing change requests. Besides, we observed that most collaborators prefer to collaborate with the core team instead of their peers. We also found that most collaboration happens in software development (60%) and maintenance (47%) tasks. Furthermore, despite personal preferences to work independently, developers still consider collaborating with others in specific task categories, for instance, software development. Finally, developers also expressed the importance of the social coding platforms, such as GITHUB, to support maintainers, and contributors in making decisions and developing tasks of the projects. Therefore, these findings may help project leaders optimize the collaborations among developers and reduce entry barriers. Moreover, these findings may support the project collaborators in understanding the collaboration process and engaging others in the project.
C. Yang, S. Zhou, J. Guo, and C. Kästner. Subtle Bugs Everywhere: Generating Documentation for Data Wrangling Code. In Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2021.
Data scientists reportedly spend a significant amountof their time in their daily routines on data wrangling, i.e., cleaning data and extracting features. However, data wrangling code is often repetitive and error-prone to write. Moreover, itis easy to introduce subtle bugs when reusing and adopting existing code, which result not in crashes but reduce model quality. To support data scientists with data wrangling, we present a technique to generate interactive documentation for data wrangling code. We use (1) program synthesis techniques to automatically summarize data transformations and (2) test case selection techniques to purposefully select representative examples from the data based on execution information collected with tailored dynamic program analysis. We demonstrate that a JupyterLab extension with our technique can provide documentation for many cells in popular notebooks and find in a user study that users with our plugin are faster and more effective at finding realistic bugs in data wrangling code.
J. Liang, R. Ji, J. Jiang, S. Zhou, Y. Lou, Y. Xiong and G. Huang. Interactive Patch Filtering as Debugging Aid. In Proceedings of the 37th International Conference on Software Maintenance and Evolution (ICSME), 2021.
It is widely recognized that patches generated by program repair tools have to be correct to be useful. However, it is fundamentally difﬁcult to ensure the correctness of the patches. Many tools generate only the patches that are highly likely to be correct by taking conservative strategies which inevitably limit the recall of APR approaches. While the recall of APR can potentially be improved by relaxing the requirement on precision, more incorrect patches may also be generated. In this paper, we conjecture that reviewing incorrect patches also helps developers to understand the bug, and with proper tool support, reviewing incorrect patches would at least not reduce the repair performance. To evaluate this, we propose an interactive patch ﬁltering approach to facilitate developers in the patch review process via effectively ﬁltering out groups of incorrect patches. We implemented the approach as an Eclipse plugin, InPaFer, and evaluated the effectiveness and usefulness with a mixed-method evaluation. The results show that our approach improves the repair performance of developers, with 62.5% more successfully repaired bugs and 25.3% less debugging time. In particular, even if all generated patches are incorrect, the performance of developers would not be signiﬁcantly reduced, and could still be improved. Our work provides a new way of thinking for the APR research.
MSR 2020 - Mining Challenge
A. Bhattacharjee, S. Nath, S. Zhou, D. Chakroborti, B. Roy, C. Roy, and K. Schneider. An Exploratory Study to Find Motives behind Cross-platform Forks from Software Heritage Dataset. In Proceedings of the 17th International Conference on Mining Software Repositories (MSR) - Mining Challenge Track, 2020.
The fork-based development mechanism provides the flexibility and the unified processes for software teams to collaborate easily in a distributed setting without too much coordination overhead. Currently, multiple social coding platforms support fork-based development, such as GitHub, GitLab, and Bitbucket. Although these different platforms virtually share the same features, they have different emphasis. As GitHub is the most popular platform and the corresponding data is publicly available, most of the current studies are focusing on GitHub hosted projects. However, we observed anecdote evidences that people are confused about choosing among these platforms, and some projects are migrating from one platform to another, and the reasons behind these activities remain unknown. With the advances of Software Heritage Graph Dataset (SWHGD), we have the opportunity to investigate the forking activities across platforms. In this paper, we conduct an exploratory study on 10 popular open-source projects to identify cross-platform forks and investigate the motivation behind. Preliminary result shows that cross-platform forks do exist. For the 10 subject systems used in this study, we found 81,357 forks in total among which 179 forks are on GitLab. Based on our qualitative analysis, we found that most of the cross-platform forks that we identified are mirrors of the repositories on another platform, but we still find cases that were created due to preference of using certain functionalities (e.g. Continuous Integration (CI)) supported by different platforms. This study lays the foundation of future research directions, such as understanding the differences between platforms and supporting cross-platform collaboration.
K. Constantino, S. Zhou, M. Souza, E. Figueiredo, and C. Kästner. Understanding Collaborative Software Development: An Interview Study. In Proceedings of the 15th ACM/IEEE International Conference on Global Software Engineering (ICGSE), 2020.
In globally distributed software development, many software developers have to collaborate and deal with issues of collaboration. Although collaboration is challenging, collaborative development produces better software than any developer could produce alone. Unlike previous work which focuses on the proposal and evaluation of models and tools to support collaborative work, this paper presents an interview study aiming to understand (i) the motivations, (ii) how collaboration happens, and (iii) the challenge and barriers of collaborative software development. After interviewing twelve experienced software developers from GitHub, we found different types of collaborative contributions, such as in management of requests for changes. Our analysis also indicates that the main barriers for collaboration are related to non-technical, rather than technical issues.
S. Zhou, B. Vasilescu, C. Kästner. How Has Forking Changed in the Last 20 Years? A Study of Hard Forks on GitHub. In Proceedings of the 42nd International Conference on Software Engineering (ICSE), 2020. Acceptance rate: 20.9% (129/617)
The notion of forking has changed with the rise of distributed version control systems and social coding environments, like GitHub. Traditionally forking refers to splitting off an independent development branch (which we call hard forks); research on hard forks, conducted mostly in pre-GitHub days showed that hard forks were often seen critical as they may fragment a community. Today, in social coding environments, open-source developers are encouraged to fork a project in order to contribute to the community (which we call social forks), which may have also influenced perceptions and practices around hard forks. To revisit hard forks, we identify, study, and classify 15,306 hard forks on GitHub and interview 18 owners of hard forks or forked repositories. We find that, among others, hard forks often evolve out of social forks rather than being planned deliberately and that perception about hard forks have indeed changed dramatically, seeing them often as a positive noncompetitive alternative to the original project.
ASE 2019 Doctoral Symposium
S. Zhou. Improving Collaboration Efficiency in Fork-based Development. In Proceedings of the Companion of the International Conference on Automated Software Engineering (ASE), New York, NY: ACM Press, 2019.
S. Zhou, B. Vasilescu, C. Kästner. What the Fork: A Study of Inefficient and Efficient Forking Practices in Social Coding. In Proceedings of the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2019. Acceptance rate: 24% (74/303)
Forking and pull requests have been widely used in open-source communities as a uniform development and contribution mechanisms, which gives developers the flexibility to modify their own fork without affecting others. However, some projects observe severe inefficiencies, including lost and duplicate contributions and fragmented communities.We observed that different communities experience these inefficiencies to widely different degrees and interviewed practitioners indicate several project characteristics and practices, including modularity and coordination mechanisms, that may encourage more efficient forking practices. In this paper, we explore how open-source projects on GitHub differ with regard to forking inefficiencies. Using logistic regression models, we analyzed the association of context factors with the inefficiencies and found that better modularity and centralized management can encourage more contributions and a higher fraction of accepted pull requests, suggesting specific good practices that project maintainers can adopt to reduce forking-related inefficiencies in their community.
J. Liang, Y. Hou, S. Zhou, J. Chen, Y. Xiong, G. Huang. How to Explain a Patch: An Empirical Study of Patch Explanations in Open Source Projects. The 30th International Symposium on Software Reliability Engineering (ISSRE), Berlin, Germany, October 2019.
Bugs are inevitable in software development and maintenance processes. Recently a lot of research efforts have been devoted to automatic program repair, aiming to reduce the efforts of debugging. However, since it is difficult to ensure that the generated patches meet all quality requirements such as correctness, developers still need to review the patch. In addition, current techniques produce only patches without explanation, making it difficult for the developers to understand the patch. Therefore, we believe a more desirable approach should generate not only the patch but also an explanation of the patch. To generate a patch explanation, it is important to first understand how patches were explained. In this paper, we explored how developers explain their patches by manually analyzing 300 merged bug-fixing pull requests from six projects on GitHub. Our contribution is twofold. First, we build a patch explanation model, which summarizes the elements in a patch explanation, and corresponding expressive forms. Second, we conducted a quantitative analysis to understand the distributions of elements, and the correlation between elements and their expressive forms.
L. Ren, S. Zhou, and C. Kästner , and A. Wąsowski. Identifying Redundancies in Fork-based Development. In Proceedings of the 27th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2019. Acceptance rate: 27 % (40/148)
Fork-based development is popular and easy to use, but makes it difficult to maintain an overview of the whole community when the number of forks increases. This may lead to redundant development where multiple developers are solving the same problem in parallel without being aware of each other. Redundant development wastes effort for both maintainers and developers. In this paper, we designed an approach to identify redundant code changes in forks as early as possible by extracting clues indicating similarities between code changes, and building a machine learning model to predict redundancies. We evaluated the effectiveness from both the maintainer's and the developer's perspectives. The result shows that we achieve 57-83% precision for detecting duplicate code changes from maintainer's perspective, and we could save developers' effort of 1.9-3.0 commits on average. Also, we show that our approach significantly outperforms existing state-of-art.
ICSE 2018 - Poster
L. Ren, S. Zhou, and C. Kästner. Poster: Forks Insight: Providing an Overview of GitHub Forks. In Proceedings of the Companion of the International Conference on Software Engineering (ICSE), New York, NY: ACM Press, 2018. Poster.
S. Zhou, Ș. Stănciulescu, O. Leßenich, Y. Xiong, A. Wąsowski, and C. Kästner. Identifying Features in Forks. In Proceedings of the 40th International Conference on Software Engineering (ICSE), New York, NY: ACM Press, May 2018. Acceptance rate: 21 % (105/502)
Fork-based development has been widely used both in open source community and industry, because it gives developers flexibility to modify their own fork without affecting others. Unfortunately, this mechanism has downsides; when the number of forks becomes large, it is difficult for developers to get or maintain an overview of activities in the forks. Current tools provide little help. We introduced INFOX, an approach to automatically identifies not-merged features in forks and generates an overview of active forks in a project. The approach clusters cohesive code fragments using code and network analysis techniques and uses information-retrieval techniques to label clusters with keywords. The clustering is effective, with 90% accuracy on a set of known features. In addition, a human-subject evaluation shows that INFOX can provide actionable insight for developers of forks.
A. Trockman, S. Zhou, C. Kästner, and B. Vasilescu. Adding Sparkle to Social Coding: An Empirical Study of Repository Badges in the npm Ecosystem. In Proceedings of the 40th International Conference on Software Engineering (ICSE), New York, NY: ACM Press, May 2018. Acceptance rate: 21 % (105/502)
In fast-paced, reuse-heavy software development, the transparency provided by social coding platforms like GitHub is essential to decision making. Developers infer the quality of projects using visible cues, known as signals, collected from personal profile and repository pages. We report on a large-scale, mixed-methods empirical study of npm packages that explores the emerging phenomenon of repository badges, with which maintainers signal underlying qualities about the project to contributors and users. We investigate which qualities maintainers intend to signal and how well badges correlate with those qualities. After surveying developers, mining 294,941 repositories, and applying statistical modeling and time series analysis techniques, we find that non-trivial badges, which display the build status, test coverage, and up-to-dateness of dependencies, are mostly reliable signals, correlating with more tests, better pull requests, and fresher dependencies. Displaying such badges correlates with best practices, but the effects do not always persist.
S. Zhou, J. Al-Kofahi, T. Nguyen, C. Kästner, and S. Nadi. Extracting Configuration Knowledge from Build Files with Symbolic Analysis. In Proceedings of the 3rd International Workshop on Release Engineering (Releng) 2015.
Build systems contain a lot of configuration knowledge about a software system, such as under which conditions specific files are compiled. Extracting such configuration knowledge is important for many tools analyzing highly-configurable systems, but very challenging due to the complex nature of build systems. We design an approach, based on SYMake, that symbolically evaluates Makefiles and extracts configuration knowledge in terms of file presence conditions and conditional parameters. We implement an initial prototype and demonstrate feasibility on small examples.
W.Hao, S. Zhou, T. Yang, R. Zhang, and Q. Wang. 2013. Elastic resource management for heterogeneous applications on PaaS. In Proceedings of the 5th Asia-Pacific Symposium on Internetware (Internetware '13). ACM, New York, NY, USA
Elastic resource management is one of the key characteristics of cloud computing systems. Existing elastic approaches focus mainly on single resource consumption such as CPU consumption, rarely considering comprehensively various features of applications. Applications deployed on a PaaS are usually heterogeneous. While sharing the same resource, these applications are usually quite different in resource consuming. How to deploy these heterogeneous applications on the smallest size of hardware thus becomes a new research topic. In this paper, we take into consideration application's CPU consumption, I/O consumption, consumption of other server resources and application's request rate, all of which are defined as application features. This paper proposes a practical and effective elasticity approach based on the analysis of application features. The evaluation experiment shows that, compared with traditional approach, our approach can save up to 32.8% VMs without significant increase of average response time and SLA violation.