Software Watermarking

Software watermarking means hiding selected features in code in order to identify it or prove its authenticity. This is useful for fighting software piracy, but also for checking the correct distribution of open-source software (for instance, projects under the GNU license). Previously proposed methods assume that the watermark can be introduced at the time of software development, and they require the understanding and input of the author for the embedding process. The goal of our research is to develop a watermarking framework that automates this process by introducing the watermark during the compilation phase, into newly developed or even legacy code. As a first approach, we studied a method based on symbolic execution and function synthesis.

In 2018, two bachelor theses analyzed two methods of symbolic execution and function synthesis in order to determine the more appropriate one for our approach. In 2019, we investigated the use of concolic execution in the context of the LLVM compiler infrastructure to hide a watermark in an unused register: with a modified register allocation, one register can be reserved for storing the watermark.

In 2020, we extended the framework for automatically embedding software watermarks into source code (now called LLWM and based on the LLVM compiler infrastructure) with further dynamic methods. The newly introduced methods rely on replacing or hiding jump targets and on call graph modifications.

In 2021, we added to LLWM further adapted dynamic methods that had already been published, as well as a newly developed method. The added methods are based, among other things, on converting conditional constructs into semantically equivalent loops or on integrating hash functions that leave the functionality of the program unchanged but increase its resilience. Our newly developed method, IR-Mark, not only specifically selects the functions in which the code generator avoids using a certain register; it also adds dynamic computations of fake values that use this register to obscure what is going on. There is a publication on both LLWM and IR-Mark.

In 2022, we added another adapted procedure to the LLWM framework. The method uses exception handling to hide the watermark. In 2023, we adapted more methods to expand the LLWM framework. These include embedding techniques based on principles of number theory and aliasing.
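One of the ideas mentioned above, converting a conditional construct into a semantically equivalent loop, can be illustrated at the source level. The following is only a minimal sketch in Python and not the LLVM-level transformation used in LLWM; the function names `clamp_if` and `clamp_loop` are hypothetical and chosen for illustration.

```python
def clamp_if(x: int, limit: int) -> int:
    """Original form: a plain conditional."""
    if x > limit:
        x = limit
    return x


def clamp_loop(x: int, limit: int) -> int:
    """Transformed form: the same conditional written as a loop.

    The loop body makes the loop condition false, so it runs at most once
    and the observable behavior is identical to clamp_if.
    """
    while x > limit:
        x = limit
    return x


if __name__ == "__main__":
    # Both variants agree on every tested input.
    for value in range(-3, 10):
        assert clamp_if(value, 5) == clamp_loop(value, 5)
    print("both forms behave identically")
```

Both functions compute the same result for every input; only the control-flow shape differs, which is the kind of structural freedom such a transformation relies on.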

In 2024, we developed three new watermarking techniques: Register Expansion, SemaCall, and SideData. They construct hash-like arithmetic that generates a watermarking message from a secret key. The first two techniques have been published in the paper "Register Expansion and SemaCall: 2 Low-overhead Dynamic Watermarks Suitable for Automation in LLVM" in the proceedings of the CheckMATE '24 workshop in Salt Lake City. We wrote an extended version that also covers the SideData watermark; it is currently under peer review for the DTRAP journal.
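The general idea of hash-like arithmetic that yields a watermarking message only for the right secret key can be sketched as follows. This is an illustrative assumption and not the construction used by Register Expansion, SemaCall, or SideData: the mixing function `mix`, the helpers `embed` and `extract`, and all constants are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep all arithmetic within 32 bits


def mix(key: int) -> int:
    """Hypothetical hash-like mixing of a 32-bit key (xor-shift and multiply)."""
    h = key & MASK32
    h ^= h >> 16
    h = (h * 0x45D9F3B) & MASK32
    h ^= h >> 16
    h = (h * 0x45D9F3B) & MASK32
    h ^= h >> 16
    return h


def embed(message: int, key: int) -> int:
    """Derive the constant that would be hidden in the program."""
    return mix(key) ^ (message & MASK32)


def extract(constant: int, key: int) -> int:
    """Recompute the message at run time from the hidden constant and the key."""
    return mix(key) ^ constant


if __name__ == "__main__":
    secret_key = 1234          # assumed secret input known to the owner
    message = 0xC0FFEE         # assumed watermark payload
    hidden = embed(message, secret_key)
    assert extract(hidden, secret_key) == message
    assert extract(hidden, secret_key + 1) != message
    print("message recovered only with the correct key")
```

In this sketch the program only carries the constant returned by `embed`; the message appears dynamically when the computation is run with the secret key, which is what makes such a watermark dynamic rather than static.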

Publications

Novac D., Eichler C., Philippsen M.:
LLWM & IR-Mark: Integrating Software Watermarks into an LLVM-based Framework
ACM SIGSAC Conference on Computer and Communications Security (CCS'21), Workshop on Offensive and Defensive Techniques in the Context of Man At The End (MATE) Attacks (Checkmate ’21) (Republic of Korea, 19.11.2021)
In: Checkmate '21: Proceedings of the 2021 Research on offensive and defensive techniques in the Context of Man At The End (MATE) Attacks, New York: 2021
DOI: 10.1145/3465413.3488576
Schwarzbeck D., Novac D., Philippsen M.:
Register Expansion and SemaCall: 2 Low-overhead Dynamic Watermarks Suitable for Automation in LLVM
ACM SIGSAC Conference on Computer and Communications Security (CCS'24), Workshop on Offensive and Defensive Techniques in the Context of Man At The End (MATE) attacks (Checkmate ’24) (Salt Lake City, UT, 18.10.2024)
In: CheckMATE '24: Proceedings of the 2024 Research on offensive and defensive techniques in the context of Man At The End (MATE) attacks, New York: 2024
DOI: 10.1145/3689934.3690815
URL: https://dl.acm.org/doi/10.1145/3689934.3690815#
Schwarzbeck D., Schuh J., Hammrich M., Philippsen M., Novac D.:
Register Expansion and SemaCall: 2 low-overhead dynamic Watermarks suitable for Automation in LLVM [Source code and Raw Experiment data]
(2024)
DOI: 10.5281/zenodo.13337275
Schwarzbeck D.:
Erweiterung eines Rahmenprogramms für das automatische Einfügen von Software-Wasserzeichen in Quellcode (Master thesis, 2024)
URL: https://github.com/FAU-Inf2/LLWM/blob/main/semacall + register-expansion + sidedata/thesis-1.pdf
Schwarzbeck D., Schuh J., Hammrich M., Philippsen M., Novac D.:
Register Expansion, SemaCall, and SideData: 3 Low-overhead Dynamic Watermarks Suitable for Automation in LLVM [Source code and Raw Experiment data]
(2024)
DOI: 10.5281/zenodo.14234819
