DOI: 10.1145/3460120.3484763
research-article
Open access

The Effect of Google Search on Software Security: Unobtrusive Security Interventions via Content Re-ranking

Published: 13 November 2021

Abstract

Google Search is where most developers start when looking on the Web for code examples to reuse. Code linked from the top results is highly likely to be among the candidates that find their way into production software. However, since large amounts of both secure and insecure code have been identified on the Web, the question arises of how Google ranks the webpages that provide this code and whether the ranking affects software security. We investigate how Google Search ranks secure and insecure cryptographic code examples from Stack Overflow. Our results show that insecure code ends up in the top results and is clicked on more often. There is at least a 22.8% chance that one of the top three Google Search results leads to insecure code. We introduce security-based re-ranking, in which the Google Search ranking is updated based on the security and relevance of the source code provided in the results. We tested our re-ranking approach and compared it to Google's original ranking in an online developer study. Participants who used our modified search engine to look for help online submitted more secure and functional results, and the difference was statistically significant. In contrast to prior work on helping developers write secure code, security-based re-ranking requires no action from developers. Our intervention remains completely invisible, which greatly increases the probability of adoption. We believe security-based re-ranking allows Internet-wide improvement of code security and prevents the far-reaching spread of insecure code found on the Web.
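The re-ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how security and relevance are combined, so the tiered ordering below (secure first, unknown next, insecure last, with the original rank as tie-breaker) is an assumption made purely for illustration. The names `SearchResult`, `is_secure`, and `security_rerank` are hypothetical, as is the idea that a security label per result is already available from some classifier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SearchResult:
    """One search hit (hypothetical structure, not taken from the paper)."""
    url: str
    original_rank: int          # 1-based position in the original ranking
    is_secure: Optional[bool]   # assumed classifier label; None if unknown


def security_rerank(results: List[SearchResult]) -> List[SearchResult]:
    """Reorder results so secure code comes first and insecure code last.

    Assumption: relevance is approximated by the original rank, which is
    used to order results within each security tier.
    """
    def tier(result: SearchResult) -> int:
        if result.is_secure is True:
            return 0   # known secure: promote
        if result.is_secure is None:
            return 1   # unknown: keep in the middle
        return 2       # known insecure: demote

    return sorted(results, key=lambda r: (tier(r), r.original_rank))


if __name__ == "__main__":
    hits = [
        SearchResult("https://example.org/a", 1, False),
        SearchResult("https://example.org/b", 2, None),
        SearchResult("https://example.org/c", 3, True),
        SearchResult("https://example.org/d", 4, True),
    ]
    for position, hit in enumerate(security_rerank(hits), start=1):
        print(position, hit.url, hit.is_secure)
```

Because the reordering happens on the result list itself, a developer using such a search front end would see nothing but an ordinary list of links, which matches the abstract's point that the intervention requires no action from developers.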

Information & Contributors

Information

Published In

CCS '21: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security
November 2021
3558 pages
ISBN: 9781450384544
DOI: 10.1145/3460120
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. content ranking
  2. software development
  3. usable security
  4. web search

Qualifiers

  • Research-article

Conference

CCS '21
CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security
November 15 - 19, 2021
Virtual Event, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

Article Metrics

  • Downloads (Last 12 months): 262
  • Downloads (Last 6 weeks): 32
Reflects downloads up to 04 Jan 2025

Cited By

  • (2024) Using AI Assistants in Software Development: A Qualitative Study on Security Practices and Concerns. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pages 2726-2740. DOI: 10.1145/3658644.3690283. Online publication date: 2-Dec-2024
  • (2024) Balancing Act: Boosting Strategies for Informed Search on Controversial Topics. Proceedings of the 2024 Conference on Human Information Interaction and Retrieval, pages 254-265. DOI: 10.1145/3627508.3638329. Online publication date: 10-Mar-2024
  • (2023) SoK. Proceedings of the Nineteenth USENIX Conference on Usable Privacy and Security, pages 341-359. DOI: 10.5555/3632186.3632205. Online publication date: 7-Aug-2023
  • (2023) The Effectiveness of Security Interventions on GitHub. Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, pages 2426-2440. DOI: 10.1145/3576915.3623174. Online publication date: 15-Nov-2023
  • (2023) "Get in Researchers; We're Measuring Reproducibility": A Reproducibility Study of Machine Learning Papers in Tier 1 Security Conferences. Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, pages 3433-3459. DOI: 10.1145/3576915.3623130. Online publication date: 15-Nov-2023
  • (2023) "Make Them Change it Every Week!": A Qualitative Exploration of Online Developer Advice on Usable and Secure Authentication. Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, pages 2740-2754. DOI: 10.1145/3576915.3623072. Online publication date: 15-Nov-2023
  • (2023) Studying Secure Coding in the Laboratory: Why, What, Where, How, and Who? 2023 IEEE/ACM 4th International Workshop on Engineering and Cybersecurity of Critical Systems (EnCyCriS), pages 23-30. DOI: 10.1109/EnCyCriS59249.2023.00008. Online publication date: May-2023
  • (2023) Simple stupid insecure practices and GitHub’s code search. Journal of Systems and Software, 202:C. DOI: 10.1016/j.jss.2023.111698. Online publication date: 1-Aug-2023
  • (2023) Is googling risky? A study on risk perception and experiences of adverse consequences in web search. Journal of the Association for Information Science and Technology, 75:5, pages 567-580. DOI: 10.1002/asi.24802. Online publication date: 6-Jun-2023
  • (2022) Nudging Software Developers Toward Secure Code. IEEE Security & Privacy, 20:2, pages 76-79. DOI: 10.1109/MSEC.2022.3142337. Online publication date: Mar-2022
