Software Engineer, Network Specialist, and PhD Student
I am a creative mind passionate about designing and implementing innovative ideas. As a software engineer, I am interested in software development, software architectures, and large computer networks. As a PhD student, I combine these interests in my research, developing methodologies to model the architecture of the whole Internet and to improve its security, performance, and availability. As an entrepreneur, I am working toward my own start-up that enhances the network communication of Web applications and services beyond the state of the art.
The Internet is shaped by independent actors and heterogeneous deployments. With the wide adoption of Transport Layer Security (TLS), a whole ecosystem of intertwined entities has emerged. Acquiring a comprehensive view allows searching for previously unknown malicious entities and providing valuable cyber-threat intelligence. Actively collected Internet-wide Domain Name System (DNS) and TLS meta-data can provide the basis for such large-scale analyses. However, an effective methodology is required to efficiently navigate the vast volumes of data. This work proposes a graph model of the TLS ecosystem that utilizes the relationships between servers, domains, and certificates. A Probabilistic Threat Propagation (PTP) algorithm is then used to propagate a threat score from existing blocklists to related nodes. To evaluate the methodology, we conducted a one-year measurement study comprising 13 monthly active Internet-wide DNS and TLS measurements. In the latest measurement, we found four highly suspicious clusters among the nodes with high threat scores, and external threat intelligence services confirmed a high rate of maliciousness among the remaining newly found servers. With the help of optimized thresholds, we identified 557 domains and 11 IP addresses throughout the year before they were known to be malicious; up to 40% of the identified nodes appeared on the input blocklist on average three months later. This work thus provides a versatile graph model to analyze the TLS ecosystem and a PTP analysis that helps security researchers focus on suspicious subsets of the Internet when searching for unknown threats.
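At the heart of the methodology is a graph whose nodes are servers, domains, and certificates and whose edges are the observed relationships between them; threat scores from blocklists are then propagated to related nodes. The following minimal Python sketch illustrates such a propagation step. The damping factor, the neighbor-averaging rule, and the toy graph are illustrative assumptions, not the exact PTP formulation from the paper.

# Minimal sketch of threat-score propagation on a server/domain/certificate graph.
# Simplified illustration only; not the exact PTP formulation used in the paper.
from collections import defaultdict

def propagate_threat(edges, seed_scores, damping=0.5, iterations=10):
    """edges: iterable of (node_a, node_b) relations, e.g. (domain, certificate).
    seed_scores: blocklisted nodes mapped to an initial threat score (e.g. 1.0)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    scores = defaultdict(float, seed_scores)
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seed_scores:            # keep blocklist seeds fixed
                updated[node] = seed_scores[node]
            elif nbrs:                         # suspicion grows with the neighbors' threat
                updated[node] = damping * sum(scores[n] for n in nbrs) / len(nbrs)
        scores.update(updated)
    return dict(scores)

# Example: a certificate shared by a blocklisted and an unknown domain makes
# the unknown domain inherit part of the threat score.
edges = [("evil.example", "cert:A"), ("unknown.example", "cert:A")]
print(propagate_threat(edges, {"evil.example": 1.0}))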
@inproceedings{sosnowski2024iteg,
  author    = {Sosnowski, Markus and Sattler, Patrick and Zirngibl, Johannes and Betzer, Tim and Carle, Georg},
  title     = {{Propagating Threat Scores With a TLS Ecosystem Graph Model Derived by Active Measurements}},
  booktitle = {Proc. Network Traffic Measurement and Analysis Conference (TMA)},
  year      = {2024},
  month     = may,
  doi       = {10.23919/TMA62044.2024.10559063},
}
IEEE TNSM
EFACTLS: Effective Active TLS Fingerprinting for Large-scale Server Deployment Characterization
Markus Sosnowski, Johannes Zirngibl, Patrick Sattler, and 4 more authors
IEEE Transactions on Network and Service Management, Jun 2024
Active measurements allow the collection of server characteristics on a large scale that can aid in discovering hidden relations and commonalities among server deployments. Finding these relations opens up new possibilities for clustering and classifying server deployments; for example, identifying a previously unknown cybercriminal infrastructure can be valuable cyber-threat intelligence. In this work, we propose a methodology based on active measurements to acquire Transport Layer Security (TLS) metadata from servers and leverage it for fingerprinting. Our fingerprints capture characteristic behavior of the TLS stack, primarily influenced by the server’s implementation, configuration, and hardware support. Using an empirical optimization strategy that maximizes information gained from every handshake to minimize measurement costs, we generated 10 general-purpose Client Hellos. They served as scanning probes to create an extensive database of TLS configurations to classify servers. We propose Shannon entropy as a measure of the collected information and use it to compare different approaches. This study fingerprinted 8 million servers from the Tranco top list and two Command and Control (C2) blocklists over 60 weeks with weekly snapshots. The resulting data formed the foundation for two long-term case studies: classification of Content Delivery Network and C2 servers. Moreover, the detection was fine-grained enough to detect C2 server families. The proposed methodology demonstrated a precision of 99% and enabled a stable identification of new servers over time. This study shows how active measurements can provide valuable security-relevant insights and improve our understanding of the Internet.
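A key part of the probe optimization is quantifying how much information the collected fingerprints carry. The sketch below shows the kind of Shannon-entropy computation the abstract refers to, applied to a toy list of fingerprint labels; the labels and the helper function are illustrative assumptions rather than the paper's implementation.

# Sketch: Shannon entropy of the observed fingerprint distribution (illustrative only).
import math
from collections import Counter

def shannon_entropy(observations):
    """observations: fingerprint labels observed across the scanned servers."""
    counts = Counter(observations)
    total = len(observations)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A probe whose responses split servers into many distinct fingerprints carries
# more information (higher entropy) than one under which most servers look alike.
print(shannon_entropy(["fp1", "fp1", "fp2", "fp3"]))  # 1.5 bits
print(shannon_entropy(["fp1", "fp1", "fp1", "fp1"]))  # 0 bits, no discriminative value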
@article{sosnowski2024efactls,author={Sosnowski, Markus and Zirngibl, Johannes and Sattler, Patrick and Carle, Georg and Grohnfeldt, Claas and Russo, Michele and Sgandurra, Daniele},journal={IEEE Transactions on Network and Service Management},title={{EFACTLS: Effective Active TLS Fingerprinting for Large-scale Server Deployment Characterization}},year={2024},volume={21},number={3},pages={2582-2595},keywords={Servers;Fingerprint recognition;Protocols;Behavioral sciences;Probes;Internet;Feature extraction;Active scanning;TLS;fingerprinting;server classification;command and control servers},doi={10.1109/TNSM.2024.3364526},issn={1932-4537},month=jun,}
Active measurements can be used to collect server characteristics on a large scale. This kind of metadata can help discover hidden relations and commonalities among server deployments, offering new possibilities to cluster and classify them. As an example, identifying previously unknown cybercriminal infrastructures can be a valuable source of cyber-threat intelligence. We propose herein an active measurement-based methodology for acquiring Transport Layer Security (TLS) metadata from servers and leveraging it for their fingerprinting. Our fingerprints capture the characteristic behavior of the TLS stack primarily caused by the implementation, configuration, and hardware support of the underlying server. Using an empirical optimization strategy that maximizes the information gained from every handshake to minimize measurement costs, we generated 10 general-purpose Client Hellos used as scanning probes to create a large database of TLS configurations for classifying servers. We fingerprinted 28 million servers from the Alexa and Majestic toplists and two Command and Control (C2) blocklists over a period of 30 weeks with weekly snapshots as the foundation for two long-term case studies: classification of Content Delivery Network and C2 servers. The proposed methodology shows a precision of more than 99% and enables a stable identification of new servers over time. This study describes a new opportunity for active measurements to provide valuable insights into the Internet that can be used in security-relevant use cases.
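To illustrate the general idea of TLS stack fingerprinting, a fingerprint can be derived from server-controlled handshake behavior such as the negotiated version, the selected cipher suite, and the extension order. The field selection and helper below are assumptions for illustration and do not reflect the paper's concrete feature set.

# Sketch: deriving a stable fingerprint from server-controlled handshake fields.
import hashlib

def tls_fingerprint(handshake):
    """handshake: dict of attributes observed in the server's handshake messages."""
    features = (
        handshake.get("tls_version", ""),
        handshake.get("cipher_suite", ""),
        ",".join(handshake.get("extensions", [])),  # extension order can be characteristic
        ",".join(handshake.get("alpn", [])),
    )
    return hashlib.sha256("|".join(features).encode()).hexdigest()[:16]

observed = {
    "tls_version": "TLS1.3",
    "cipher_suite": "TLS_AES_256_GCM_SHA384",
    "extensions": ["key_share", "supported_versions"],
    "alpn": ["h2", "http/1.1"],
}
print(tls_fingerprint(observed))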
@inproceedings{sosnowski2022tlsfingerprinting,
  author    = {Sosnowski, Markus and Zirngibl, Johannes and Sattler, Patrick and Carle, Georg and Grohnfeldt, Claas and Russo, Michele and Sgandurra, Daniele},
  title     = {{Active TLS Stack Fingerprinting: Characterizing TLS Server Deployments at Scale}},
  booktitle = {Proc. Network Traffic Measurement and Analysis Conference (TMA)},
  year      = {2022},
  month     = jun,
}
Collecting metadata from Transport Layer Security (TLS) servers on a large scale allows drawing conclusions about their capabilities and configuration. This not only provides insights into the Internet but also enables use cases like detecting malicious Command and Control (C&C) servers. However, active scanners can only observe and interpret the behavior of TLS servers; the underlying configuration and implementation causing that behavior remain hidden. Existing approaches struggle between resource-intensive scans that can reconstruct this data and lightweight fingerprinting approaches that aim to differentiate servers without making any assumptions about their inner workings. With this work, we propose DissecTLS, an active TLS scanner that is both lightweight enough to be used for Internet measurements and able to reconstruct the configuration and capabilities of the TLS stack. We achieved this by modeling the parameters of the TLS stack and deriving an active scan that dynamically creates scanning probes based on the model and the previous responses from the server. We provide a comparison of five active TLS scanning and fingerprinting approaches in a local testbed and on toplist targets. We conducted a measurement study over nine weeks to fingerprint C&C servers and analyzed the usage of popular and deprecated TLS parameters. Similar to related work, the fingerprinting achieved a maximum precision of 99% for a conservative detection threshold of 100%; at the same time, we improved the recall by a factor of 2.8.
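The abstract describes probes that are generated dynamically from a model of the server's TLS stack and its previous responses. The loop below sketches this adaptive idea for reconstructing a server's parameter preference order; send_client_hello is a hypothetical callable, and the sketch illustrates the approach rather than DissecTLS itself.

# Sketch of an adaptive scan: each probe depends on what the model still leaves open.
def dissect_server(send_client_hello, candidate_parameters):
    """send_client_hello: callable that offers the given parameters in a handshake
    and returns the server's selection (or None on rejection).
    candidate_parameters: e.g. cipher suites or groups to test."""
    model = {}                            # reconstructed preference order
    remaining = set(candidate_parameters)
    while remaining:
        offer = sorted(remaining)         # offer everything not yet ruled out
        selected = send_client_hello(offer)
        if selected is None or selected not in remaining:
            break                         # server rejected the offer or answered unexpectedly
        model[selected] = len(model)      # record the preference rank
        remaining.discard(selected)       # omit it next time to reveal the next choice
    return model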
@inproceedings{10.1007/978-3-031-28486-1_6,
  author    = {Sosnowski, Markus and Zirngibl, Johannes and Sattler, Patrick and Carle, Georg},
  editor    = {Brunstrom, Anna and Flores, Marcel and Fiore, Marco},
  title     = {{DissecTLS: A Scalable Active Scanner for TLS Server Configurations, Capabilities, and TLS Fingerprinting}},
  booktitle = {Proc. Passive and Active Measurement (PAM)},
  year      = {2023},
  publisher = {Springer Nature Switzerland},
  pages     = {110--126},
  isbn      = {978-3-031-28486-1},
  doi       = {10.1007/978-3-031-28486-1_6},
}
Quantum Computers (QCs) differ radically from traditional computers and can efficiently solve mathematical problems fundamental to our current cryptographic algorithms. Although existing QCs do not yet have enough qubits to break these algorithms, the concern of "Store-Now-Decrypt-Later" (i.e., adversaries store encrypted data today and decrypt them once powerful QCs become available) highlights the necessity to adopt quantum-safe approaches as soon as possible. In this work, we investigate the performance impact of Post-Quantum Cryptography (PQC) on TLS 1.3. Different signature algorithms and key agreements (as proposed by the National Institute of Standards and Technology (NIST)) are examined through black- and white-box measurements to obtain precise handshake latencies and computational costs per participating library. We emulated loss, bandwidth, and delay to analyze constrained environments. Our results reveal that HQC and Kyber are on par with the current state of the art, while Dilithium and Falcon are even faster. We observed no performance drawback from using hybrid algorithms; moreover, at higher NIST security levels, PQC outperformed any algorithm in use today. Hence, we conclude that post-quantum TLS is suitable for adoption in today’s systems.
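As context for the black-box measurements, the client-side handshake latency of a TLS 1.3 server can be measured roughly as sketched below. Stock Python/OpenSSL does not ship the post-quantum algorithms examined in the paper, so this only illustrates the measurement approach; the target host and run count are arbitrary assumptions.

# Sketch: black-box measurement of TLS 1.3 handshake latency from the client side.
import socket
import ssl
import time

def handshake_latency(host, port=443, runs=10):
    """Measure only the TLS handshake time, excluding TCP connection setup."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    samples = []
    for _ in range(runs):
        with socket.create_connection((host, port), timeout=5) as sock:
            start = time.perf_counter()
            with context.wrap_socket(sock, server_hostname=host):
                samples.append(time.perf_counter() - start)  # handshake completes inside wrap_socket
    return min(samples), sum(samples) / len(samples)

print(handshake_latency("example.org"))  # (best, mean) handshake latency in seconds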
@inproceedings{10.1145/3624354.3630585,
  author    = {Sosnowski, Markus and Wiedner, Florian and Hauser, Eric and Steger, Lion and Schoinianakis, Dimitrios and Gallenm{\"u}ller, Sebastian and Carle, Georg},
  title     = {{The Performance of Post-Quantum TLS 1.3}},
  booktitle = {Proc. International Conference on emerging Networking EXperiments and Technologies (CoNEXT)},
  year      = {2023},
  address   = {Paris, France},
  month     = dec,
  keywords  = {performance measurements, post-quantum cryptography},
  doi       = {10.1145/3624354.3630585},
}
Web Application developers have two main options to improve the performance of their REST APIs using Content Delivery Network (CDN) caches: define a Time to Live (TTL) or actively invalidate content. However, TTL-based caching is unsuited for the dynamic data exchanged via REST APIs, and neither can speed up write requests. Performance is important, as client latency directly impacts revenue, and a system’s scalability is determined by its achievable throughput. A new type of Web proxy that acts as an information broker for the underlying data rather than working on the level of HTTP requests presents new possibilities for enhancing REST APIs. Existing Conflict-free Replicated Data Type (CRDT) semantics and standards like JSON:API can serve as a basis for such a broker. We propose CRDT Web Caching (CWC) as a novel method for distributing application data in a network of Web proxies, enabling origins to automatically update outdated cached content and proxies to respond directly to write requests. We compared simple forwarding, TTL-based caching, invalidation-based caching, and CWC in a simulated CDN deployment. Our results show that TTL-based caching can achieve the best performance, but the long inconsistency window makes it unsuitable for dynamic REST APIs. CWC outperforms invalidation-based caching in terms of throughput and latency due to a higher cache-hit ratio, and it is the only option that can accelerate write requests. However, under high system load, increased performance may lead to higher latency for non-acceleratable requests due to the additional synchronization. CWC allows developers to significantly increase REST API performance above the current state-of-the-art.
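The broker idea relies on CRDT merge semantics, which let proxies accept writes locally and still converge with the origin. The last-writer-wins register below illustrates only the merge principle; the paper builds on richer JSON document semantics (JSON:API), so this sketch is a simplified assumption rather than the system's actual data type.

# Sketch: a last-writer-wins (LWW) register, one of the simplest state-based CRDTs.
import time

class LWWRegister:
    def __init__(self, value=None, timestamp=0.0, node_id=""):
        self.value, self.timestamp, self.node_id = value, timestamp, node_id

    def assign(self, value, node_id):
        """Local write at a proxy or at the origin."""
        self.value, self.timestamp, self.node_id = value, time.time(), node_id

    def merge(self, other):
        """State-based merge: the newer write wins; ties are broken by node id."""
        if (other.timestamp, other.node_id) > (self.timestamp, self.node_id):
            self.value, self.timestamp, self.node_id = other.value, other.timestamp, other.node_id

# A proxy accepts a write while the origin updates the same resource concurrently;
# merging in either order yields the same converged state at both replicas.
origin, proxy = LWWRegister(), LWWRegister()
origin.assign({"stock": 5}, "origin")
proxy.assign({"stock": 4}, "proxy-eu")
origin.merge(proxy)
proxy.merge(origin)
assert origin.value == proxy.value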
@inproceedings{sosnowski2024crdts,
  author    = {Sosnowski, Markus and {von Seck}, Richard and Wiedner, Florian and Carle, Georg},
  title     = {{CRDT Web Caching}: Enabling Distributed Writes and Fast Cache Consistency for {REST} {APIs}},
  booktitle = {Proc. 20th International Conference on Network and Service Management (CNSM)},
  address   = {Prague, Czech Republic},
  keywords  = {REST;CDN;CRDT},
  year      = {2024},
  day       = {28},
  month     = oct,
  doi       = {10.23919/CNSM62983.2024.10814315},
}