
Sunday, December 31, 2023

Use Hostname to deduce running services

Disclaimer: the information in this article was disclosed to my current company's Patent Committee in December 2023. They decided neither to pursue it further nor to keep it as a trade secret, so it remains simply an idea.
I am therefore disclosing it here. I thought it was a good idea, and maybe some people will see an interesting use case for it.
Also, as of today, I am not aware of any product or product feature that uses this idea.

So to understand the idea, let's start with some basics.

We talk about 'semantics' when we talk about the meaning of a word.

In computing, and more specifically in networking, this goes down to the bit level: 0 and 1 are distinct values, and they have a meaning (is / is not; true / false). So a single bit may carry a meaning, and so may a set of bits taken together.

Take, for instance, the flags in the TCP header.

  TCP Header Format

                                    
    0                   1                   2                   3   
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                            TCP Header Format

          Note that one tick mark represents one bit position.

                               Figure 3. (from RFC 793)
For instance, the SYN flag, when set to 1, means the segment opens a connection: it is the first segment sent by that host on the connection and, more importantly, the sequence number in the header is the initial sequence number.
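To make the bit-level meaning concrete, here is a minimal Python sketch that reads the six flags from the header layout shown in the figure. The function name `decode_tcp_header` and the sample values are mine, for illustration only, and the sketch ignores the options and the flags added by later RFCs.

```python
import struct

# Flag bits as laid out in the RFC 793 figure: the low 6 bits of the
# 16-bit word that also holds Data Offset and Reserved.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def decode_tcp_header(header: bytes) -> dict:
    """Decode the first 16 bytes of a TCP header (illustrative helper)."""
    src, dst, seq, ack, offset_and_flags, window = struct.unpack("!HHIIHH", header[:16])
    flags = sorted(name for name, bit in FLAG_BITS.items() if offset_and_flags & bit)
    return {
        "source_port": src,
        "destination_port": dst,
        "flags": flags,
        # With SYN set, the sequence number field carries the initial sequence number.
        "initial_sequence_number": seq if "SYN" in flags else None,
    }


# A hand-built 20-byte header: data offset 5, only SYN set, sequence number 1000.
sample = struct.pack("!HHIIHHHH", 49152, 53, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
info = decode_tcp_header(sample)
print(info["flags"], info["initial_sequence_number"])  # ['SYN'] 1000
```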
Another illustration of this kind of semantics is the MAC address, and more specifically its first half. Those 3 bytes (the OUI, Organizationally Unique Identifier) identify a specific vendor.
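As a small sketch of that split, the Python snippet below separates a MAC address into its vendor half and its device half. The helper name `split_mac` and the one-entry vendor table are hypothetical; a real lookup would rely on the IEEE registry rather than on a hand-written dictionary.

```python
def split_mac(mac: str) -> tuple[str, str]:
    """Return (vendor half, device half) of a MAC address as lowercase hex."""
    digits = mac.replace(":", "").replace("-", "").replace(".", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return digits[:6], digits[6:]


# Hypothetical table, for illustration only (not a real IEEE assignment).
VENDORS = {"aabbcc": "Example Vendor"}

oui, device = split_mac("AA:BB:CC:12:34:56")
print(oui, device, VENDORS.get(oui, "unknown"))  # aabbcc 123456 Example Vendor
```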
The last example I want to discuss in this post is the IP address. You may have come across IP addressing plans, and with them some rules: for instance, in a network range, the first available address is the gateway IP address, the 2nd one is for the active gateway (as a device), and the 3rd one is for the passive gateway (again as a device).
Basically:
x.x.x.1 for the default gateway
x.x.x.2 for the active node
x.x.x.3 for the passive node
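Such a convention is easy to express in code. The sketch below uses Python's standard `ipaddress` module; the function name `role_of` and the role labels are my own, and the mapping only reflects the example convention above, not a universal rule.

```python
import ipaddress

# Position inside the network range -> role, following the convention above.
ROLE_BY_POSITION = {1: "default gateway", 2: "active node", 3: "passive node"}


def role_of(address: str, network: str) -> str:
    """Deduce the role of an address from its position in the network range."""
    net = ipaddress.ip_network(network)
    ip = ipaddress.ip_address(address)
    if ip not in net:
        raise ValueError(f"{address} is not in {network}")
    position = int(ip) - int(net.network_address)
    return ROLE_BY_POSITION.get(position, "regular host")


print(role_of("10.1.2.1", "10.1.2.0/24"))   # default gateway
print(role_of("10.1.2.3", "10.1.2.0/24"))   # passive node
print(role_of("10.1.2.57", "10.1.2.0/24"))  # regular host
```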
All 3 points are examples of semantics in networking.
Now, there is a feature in PAN-OS that allows users to build policies based on IP semantics: IP Wildcard Objects.
A wildcard object matches every IP address that meets the object's wildcard condition.
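To illustrate the matching logic, here is a minimal Python sketch. It assumes an object written as `address/wildcard-mask`, where a 0 bit in the mask means "this bit must match" and a 1 bit means "this bit is ignored"; that is my understanding of the PAN-OS behaviour, so check the vendor documentation before relying on it. The function name `matches_wildcard` is hypothetical.

```python
import ipaddress


def matches_wildcard(address: str, wildcard_object: str) -> bool:
    """Check an IPv4 address against an 'address/wildcard-mask' object."""
    base_text, mask_text = wildcard_object.split("/")
    base = int(ipaddress.IPv4Address(base_text))
    mask = int(ipaddress.IPv4Address(mask_text))
    ip = int(ipaddress.IPv4Address(address))
    # Assumption: bits set to 0 in the mask must match, bits set to 1 are ignored.
    care_bits = ~mask & 0xFFFFFFFF
    return (ip & care_bits) == (base & care_bits)


# "Every 10.x.y.1": matches the default gateway of each /24 in the plan above.
gateways = "10.0.0.1/0.255.255.0"
print(matches_wildcard("10.20.30.1", gateways))  # True
print(matches_wildcard("10.20.30.2", gateways))  # False
```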
The idea I propose is about the hostname. Most objects are defined with a hostname (which is simpler than remembering the IP address of every machine running on the network). Most of the time, the hostname follows a naming convention, so if you can determine that convention, or at least identify a key portion of the hostname that links to a running service, this knowledge can be saved and reused for other purposes.
So, by analysing a firewall configuration:
- we can get a mapping of hostname <-> security rules; for instance, you have a security rule allowing DNS requests to the object "fr-dns-1".
- and by doing this across a lot of configuration files, you can then get trends on hostname portion <-> applications. For instance, when you collect 1000 security rules for DNS traffic to different hostnames, it will appear that some characters are common to most of the hostnames (for instance, "dns" in a hostname may indicate that the server runs a DNS service).
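The two steps above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the input is a list of (hostname, application) pairs already extracted from security rules, hostnames are split on anything that is not a letter, and a token is kept when it appears in at least `min_share` of the rules for an application. A real implementation would need something smarter, for example to discard tokens such as a country code that are frequent for every application.

```python
import re
from collections import Counter, defaultdict


def tokenize(hostname: str) -> list[str]:
    """Split a hostname into alphabetic tokens: 'fr-dns-1' -> ['fr', 'dns']."""
    return [token for token in re.split(r"[^a-z]+", hostname.lower()) if token]


def token_trends(rules, min_share=0.6):
    """Return, per application, the tokens shared by most destination hostnames."""
    totals = Counter()
    per_app = defaultdict(Counter)
    for hostname, app in rules:
        totals[app] += 1
        for token in set(tokenize(hostname)):  # count each token once per rule
            per_app[app][token] += 1
    return {
        app: {tok: n / totals[app] for tok, n in counts.items() if n / totals[app] >= min_share}
        for app, counts in per_app.items()
    }


# Made-up sample data standing in for rules collected from many configurations.
rules = [
    ("fr-dns-1", "dns"), ("us-dns-2", "dns"), ("de-ns-1", "dns"),
    ("fr-web-1", "web-browsing"), ("us-web-3", "web-browsing"),
]
trends = token_trends(rules)
print(sorted(trends["dns"]))   # ['dns']
print(trends["web-browsing"])  # {'web': 1.0}
```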
So once such trends (hostname parts that are common to most of the security rules for the same application) are identified for every application, it becomes possible to build consumer services that benefit from them:
- an AI copilot for configuration assistance ("Make a security rule to strictly allow only the DNS application to the DNS servers")
- configuration audit, making sure that, for all the hostnames found in a configuration, only the required applications are allowed
- intelligence capabilities: if you collect all the A records, the applications likely running behind each record can be returned.
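As an example of the audit use case, here is a minimal Python sketch. The `EXPECTED_APPS` table is hypothetical and stands in for trends learned beforehand; the rule list is made up, and the application names are only illustrative labels.

```python
import re

# Hypothetical result of the trend analysis: hostname token -> expected applications.
EXPECTED_APPS = {"dns": {"dns"}, "web": {"web-browsing", "ssl"}}


def audit(rules):
    """Flag rules allowing an application the hostname does not suggest."""
    findings = []
    for hostname, app in rules:
        tokens = re.split(r"[^a-z]+", hostname.lower())
        expected = set()
        for token in tokens:
            expected |= EXPECTED_APPS.get(token, set())
        # Hostnames with no known token are skipped rather than flagged.
        if expected and app not in expected:
            findings.append((hostname, app, sorted(expected)))
    return findings


rules = [("fr-dns-1", "dns"), ("fr-dns-1", "ssh"), ("fr-app-7", "ssh")]
for hostname, app, expected in audit(rules):
    print(f"{hostname}: '{app}' allowed, expected only {expected}")
```

With this sample, only the second rule is reported, since "fr-app-7" contains no token known to the table.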
