Python Crypto API Misuses in the Wild: Design and Implementation of LICMA

6 May 2024

Authors:

(1) Anna-Katharina Wickert, Technische Universität Darmstadt, Darmstadt, Germany (wickert@cs.tu-darmstadt.de);

(2) Lars Baumgärtner, Technische Universität Darmstadt, Darmstadt, Germany (baumgaertner@cs.tu-darmstadt.de);

(3) Florian Breitfelder, Technische Universität Darmstadt, Darmstadt, Germany (florian.breitfelder@tu-darmstadt.de);

(4) Mira Mezini, Technische Universität Darmstadt, Darmstadt, Germany (mezini@cs.tu-darmstadt.de).

Abstract and 1 Introduction

2 Background

3 Design and Implementation of Licma and 3.1 Design

3.2 Implementation

4 Methodology and 4.1 Searching and Downloading Python Apps

4.2 Comparison with Previous Studies

5 Evaluation and 5.1 GitHub Python Projects

5.2 MicroPython

6 Comparison with previous studies

7 Threats to Validity

8 Related Work

9 Conclusion, Acknowledgments, and References

3 DESIGN AND IMPLEMENTATION OF LICMA

In this section, we describe the design of our static analysis tool LICMA and then discuss its implementation in more detail.

3.1 Design

A general overview of LICMA is given in Figure 1. First, we parse a source code file into its Abstract Syntax Tree (AST). More specifically, we use Babelfish [2] to create a Universal Abstract Syntax Tree (UAST), which combines language-independent AST elements with language-specific elements. For simplicity, we use the term AST in the remainder of the paper. Second, we apply the LICMA analysis to the AST to identify potential crypto misuses, i.e., violations of our crypto rules.
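To make the first step concrete, the following is a minimal sketch of parsing a source file into an AST and locating the calls a rule is interested in. It is not LICMA's implementation: it uses Python's standard `ast` module as a stand-in for Babelfish, and the helper `find_calls`, the sample source, and the choice of `PBKDF2` as the call of interest are assumptions made for illustration.

```python
import ast

# Hypothetical input file; the PBKDF2 call is only parsed, never executed.
SOURCE = '''
from Crypto.Protocol.KDF import PBKDF2

def derive(password, salt):
    return PBKDF2(password, salt, 16, 500)
'''

def find_calls(tree, func_name):
    """Return all Call nodes whose callee is named func_name."""
    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            callee = node.func
            # Handle both plain names (PBKDF2(...)) and attributes (KDF.PBKDF2(...)).
            if isinstance(callee, ast.Name):
                name = callee.id
            else:
                name = getattr(callee, "attr", None)
            if name == func_name:
                calls.append(node)
    return calls

tree = ast.parse(SOURCE)              # step 1: source code -> AST
calls = find_calls(tree, "PBKDF2")    # step 2 starts from the calls named in a rule
call_lines = [call.lineno for call in calls]
```

The calls found this way are the starting points for the analysis; their arguments are what a rule would inspect.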

Based on the rule that defines a misuse, the analysis checks the source code for a violation of that rule and triggers a backward analysis to do so. The backward slice is created by filtering the AST with XPath [3] queries and works as follows. First, the backward slicing algorithm (BSA) identifies all source code lines that the respective rule refers to. An example of a slicing criterion is a function call parameter, such as the key passed to a crypto function. Second, the BSA determines for each of these function calls whether the parameter is hard-coded, a local assignment, or a global assignment. If one of the three cases applies, the corresponding value is returned and checked against a function defined in the rule, e.g., whether the value is smaller than 1,000 for §5. If none of the cases applies, the BSA looks for the caller of the function that contains the call and checks the caller's parameters as described above. The algorithm stops when a value is returned or no further callers are available to analyse, and returns the result of the analysis. In its reports, LICMA distinguishes between a potential misuse, where it cannot resolve the value of interest due to missing callers, and a definite misuse, where the value is resolved to an insecure option.
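The backward resolution described above can be sketched as follows. This is an illustrative approximation under stated assumptions, not LICMA's code: it walks Python's standard `ast` instead of querying a Babelfish UAST with XPath, it only follows constants assigned directly to a name, and it ignores nested functions and methods. The helper names (`resolve`, `classify`, `UNRESOLVED`), the sample source, and the rule (fourth argument of `PBKDF2` must not be below 1,000) are hypothetical.

```python
import ast

# Hypothetical input; the code is only parsed, never executed.
SOURCE = '''
from Crypto.Protocol.KDF import PBKDF2

ITERATIONS = 500

def derive(password, salt, count):
    return PBKDF2(password, salt, 16, count)

def weak(password, salt):
    return derive(password, salt, ITERATIONS)

def exported(password, salt, count):
    return PBKDF2(password, salt, 16, count)
'''

UNRESOLVED = object()  # marks a path that ends without a value

def callee_name(call):
    f = call.func
    return f.id if isinstance(f, ast.Name) else getattr(f, "attr", None)

def enclosing_function(tree, target):
    """Function that contains `target` (assumes no nested functions)."""
    for fn in ast.walk(tree):
        if isinstance(fn, ast.FunctionDef) and any(n is target for n in ast.walk(fn)):
            return fn
    return None

def assigned_constant(body, name):
    """Constant assigned to `name` by a direct statement in `body`, else None."""
    for stmt in body:
        if isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Constant):
            if any(isinstance(t, ast.Name) and t.id == name for t in stmt.targets):
                return stmt.value.value
    return None

def resolve(tree, call, index, depth=0):
    """Values that argument `index` of `call` may take, resolved backwards."""
    if index >= len(call.args) or depth > 10:
        return [UNRESOLVED]
    arg = call.args[index]
    if isinstance(arg, ast.Constant):                 # case 1: hard-coded
        return [arg.value]
    if not isinstance(arg, ast.Name):
        return [UNRESOLVED]
    fn = enclosing_function(tree, call)
    params = [a.arg for a in fn.args.args] if fn else []
    if fn is not None:
        local = assigned_constant(fn.body, arg.id)    # case 2: local assignment
        if local is not None:
            return [local]
    if arg.id not in params:
        glob = assigned_constant(tree.body, arg.id)   # case 3: global assignment
        return [glob] if glob is not None else [UNRESOLVED]
    # None of the cases applies: continue at the callers of the enclosing function.
    pos = params.index(arg.id)
    callers = [c for c in ast.walk(tree)
               if isinstance(c, ast.Call) and callee_name(c) == fn.name]
    if not callers:
        return [UNRESOLVED]                           # no further callers
    values = []
    for caller in callers:
        values.extend(resolve(tree, caller, pos, depth + 1))
    return values

def classify(values, is_insecure):
    if any(v is not UNRESOLVED and is_insecure(v) for v in values):
        return "definite misuse"
    if any(v is UNRESOLVED for v in values):
        return "potential misuse"
    return "no misuse"

def too_few_iterations(value):
    # Check function of the hypothetical rule: iteration count below 1,000.
    return isinstance(value, int) and value < 1000

tree = ast.parse(SOURCE)
reports = {}
for node in ast.walk(tree):
    if isinstance(node, ast.Call) and callee_name(node) == "PBKDF2":
        owner = enclosing_function(tree, node)
        reports[owner.name] = classify(resolve(tree, node, 3), too_few_iterations)
```

In this sample, the call inside `derive` is traced through its caller `weak` to the global `ITERATIONS = 500` and is reported as a definite misuse, while the call inside `exported` has no callers in the file and is reported as a potential misuse.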

This paper is available on arXiv under the CC BY 4.0 DEED license.


[2] https://github.com/bblfsh/bblfshd

[3] https://www.w3.org/TR/xpath/