Cognitive Computing Laboratory

Enhancing Security for Modern Software Programming Cyberinfrastructure

Software plays a vital role in supporting scientific communities. Unlike conventional approaches (e.g., code handbooks), modern software programming cyberinfrastructure (CI), which consists of online discussion platforms (e.g., Stack Overflow) and social coding repositories (e.g., GitHub), offers an open-source, collaborative environment in which distributed scientific communities can expedite software development. Within this ecosystem, researchers and developers can reuse code snippets and libraries or adapt existing ready-to-use software to solve their own problems. Despite the apparent benefits of this social coding paradigm, its security risks have been largely overlooked: insecure or malicious code can easily be embedded and distributed (e.g., copied and pasted into production software), which could severely damage the scientific credibility of CI. There is therefore an urgent need for scalable techniques and tools that automatically detect insecure or malicious open-source code. To address this issue, our newly funded project under the NSF CICI program explores innovative links between artificial intelligence and cybersecurity to enhance the security of modern software programming CI. The proposed research has three key components: (1) a novel AI-based solution will be developed to automatically identify suspicious, insecure code snippets on Stack Overflow based on their social coding properties; (2) a cross-platform model will first be constructed to represent the complex interplay between GitHub and Stack Overflow, and deep learning techniques will then be used to build a predictive model for the automatic detection of malicious code on GitHub; and (3) a user-friendly tool will be developed to enhance code security in software development.

System architecture of iTrustGH: automatic detection of malicious code on GitHub.