How the Internet works
#04 - How the internet works
If someone asks the following question in an interview:
Please describe an HTTP request in as much detail as possible
This opens up a lot of possibilities, but maybe we should start with something a little bit bigger -
The internet is the 8th wonder of the world - within seconds we can connect to remote locations all over the world 💙
When you use your browser to open your emails, what happens in the background is fascinating.
You send a request to a web server and get a response with YOUR data and not someone else's, even though someone else opened their email at the same time.
But how does that work?
Let's dive into the rabbit hole together
When you open your browser, type mail.google.com and press Enter, a lot of things happen almost instantly:
- ARP Traffic
- DNS Traffic
- TCP/UDP Traffic
- TLS Traffic
First up - ARP Traffic
Address Resolution Protocol is what your router uses to figure out who has which IP address locally - but how does that work?
Great question! Your router just asks EVERYONE
The Address Resolution Protocol links the internet layer and the link layer of the TCP/IP model (layers 2 and 1) - it translates between IP addresses and MAC addresses.
One machine sends a broadcast message where they ask everyone on the network:
“Who has IP address 192.168.0.92? Tell me”
The router might also be a megaphone for you (192.168.0.13) asking:
“Who has IP Address 192.168.0.92? Tell 192.168.0.13”
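To make the question a bit more tangible, here is a minimal sketch of how such a "who has" request could be packed into bytes. It only builds the 28-byte ARP payload and does not send anything; the MAC address is a made-up placeholder, and the function name is just an illustration.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Pack the 28-byte ARP request payload for IPv4 over Ethernet."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # length of a MAC address in bytes
        4,                            # length of an IPv4 address in bytes
        1,                            # opcode 1 = request ("who has ...?")
        sender_mac,                   # who is asking (MAC)
        socket.inet_aton(sender_ip),  # who is asking (IP), the "tell ..." part
        b"\x00" * 6,                  # target MAC is unknown, that is the question
        socket.inet_aton(target_ip),  # the IP address we are looking for
    )


# Assumption: aa:bb:cc:dd:ee:ff is a placeholder MAC, not a real device.
packet = build_arp_request(bytes.fromhex("aabbccddeeff"), "192.168.0.13", "192.168.0.92")
print(len(packet), "bytes:", packet.hex())
```

On the wire this payload sits inside an Ethernet frame addressed to the broadcast MAC ff:ff:ff:ff:ff:ff, which is why everyone on the local network gets to see it.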
Can we see that in real-life?
YES!
Fire up Wireshark (if you have not worked with Wireshark, check out section 03).
Select your correct network interface, double-click it, and wait for a second or two.
Then add the following filter: arp
You will see something like below:
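Let's dive into it - first, the router (192.168.0.1) asks everyone “Who has 192.168.0.92? Tell 192.168.0.1”
The next step is our client telling the router - IT'S MEEEE (in Wireshark the reply shows up as “192.168.0.92 is at”, followed by the client's MAC address)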
This mapping (IP <> MAC address) is stored in the router memory for future reference and the communication can start.
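The ask-everyone-then-remember idea can be sketched as a toy simulation. This is not how a real router is implemented; the device list, the MAC addresses and the function name are all hypothetical and only there to show the caching step.

```python
# Assumption: a pretend local network, IP address -> MAC address.
DEVICES = {
    "192.168.0.13": "aa:bb:cc:00:00:13",
    "192.168.0.92": "aa:bb:cc:00:00:92",
}

# The router's memory: filled in as replies arrive.
arp_cache = {}


def resolve_mac(ip):
    """Return the MAC address for an IP, asking everyone only if needed."""
    if ip in arp_cache:
        return arp_cache[ip]  # already known, no broadcast necessary

    # "Broadcast": every device sees the question, only the owner answers.
    for device_ip, mac in DEVICES.items():
        if device_ip == ip:
            arp_cache[ip] = mac  # store the mapping for future reference
            return mac
    return None  # nobody answered


print(resolve_mac("192.168.0.92"))  # first call: has to ask everyone
print(resolve_mac("192.168.0.92"))  # second call: served from the cache
```

Real ARP caches also forget entries after a while, which this sketch leaves out.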
Why is this important?
Usually, your computer is not directly connected to the internet but there is some device (router/modem) in the middle.
That means you have a local network.
When you go out to the internet you need an address. These are called Internet Protocol addresses, or IP addresses.
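If you want to check whether an address belongs to a local network or to the public internet, Python's standard ipaddress module can tell you. A small sketch; the helper name is made up, and 8.8.8.8 is used here simply as a well-known public address.

```python
import ipaddress


def describe(address):
    """Label an IP address as local (private) or public."""
    ip = ipaddress.ip_address(address)
    return "local (private)" if ip.is_private else "public"


for address in ["192.168.0.1", "192.168.0.92", "8.8.8.8"]:
    print(f"{address:<14} {describe(address)}")
```

Addresses like 192.168.x.x are reserved for local networks, which is why so many home networks use the same ones without clashing.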
Wait, why do we need one of those again?
BUT… we don’t usually use IP addresses in our browsers, do we now?!
No, we use website URLs and domain names, but those are not the “addresses” that servers understand.
They speak IP addresses.
How do we translate domain names into the needed IP addresses?
Great question - this is done via DNS, the Domain Name System.
Domain name resolution is the translation of domain names <> IP addresses.
When you for example type https://academy.maikroservice.com into the address bar of your browser…
It needs to be translated into an IP address for the server that holds the website you are trying to access.
This is done by something called a DNS nameserver.
There are public ones from e.g. Cloudflare (1.1.1.1) and Google (8.8.8.8)
and private ones (e.g. your router might have DNS nameserver capabilities, or your Domain Controller).
But what does the process look like?
Great question - it's one step after the next - just like on a ladder - the DNS Library Ladder.
When you type the URL in your browser, the first thing that happens is that the browser cache (1) is checked to see whether the domain exists in it (a cache is a temporary copy of information).
If not, then your local computer cache is checked (2).
If that one also does not have the right data, your local DNS server (e.g. PowerDNS) is next in line (3).
Well… that one also does not have the domain saved, what now?
Now comes the time when your internet service provider's DNS cache might be searched. This one is typically a recursive DNS nameserver (4).
Usually, the recursive nameserver does have the correct information for us 🎉
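The ladder above can be sketched as a list of caches that are checked one rung at a time. This is a simplified model, not a real resolver: the cache contents are invented, 203.0.113.10 is a documentation placeholder and not the real address of the site, and the function name is hypothetical.

```python
# Assumption: each rung is (name, cache), in the order they are checked.
LADDER = [
    ("browser cache", {}),
    ("local computer cache", {}),
    ("local DNS server", {}),
    ("ISP recursive nameserver", {"academy.maikroservice.com": "203.0.113.10"}),
]


def resolve_domain(domain):
    """Walk down the ladder until one rung knows the answer."""
    for index, (name, cache) in enumerate(LADDER):
        if domain in cache:
            address = cache[domain]
            # On the way back up, every rung we passed remembers the answer.
            for _, earlier_cache in LADDER[:index]:
                earlier_cache[domain] = address
            return name, address
    return None  # nobody on the ladder knew the domain


first = resolve_domain("academy.maikroservice.com")
second = resolve_domain("academy.maikroservice.com")
print("first lookup answered by:", first[0])
print("second lookup answered by:", second[0])
```

The second lookup never leaves the browser, which is the whole point of caching: the further down the ladder you have to go, the longer it takes.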
Can we actually see DNS traffic?
Sure - fire up Wireshark again and a terminal as well
Now use
dig academy.maikroservice.com +short
and watch the magic happen
You will see two lines appear in Wireshark once you use the command above - one DNS query and one DNS response.
They are linked, as you can see in the first column.
The dig command basically asks for an A record for the subdomain, and A records are IPv4 address “pointers”.
They point a domain to an IPv4 address, while AAAA records point to an IPv6 address.
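To see where the record type actually lives, here is a minimal sketch that builds the raw bytes of a DNS query by hand. It does not send anything over the network; the query ID 0x1234 is arbitrary and the function name is an assumption for illustration. The only difference between asking for an A and an AAAA record is one number in the question.

```python
import struct

# Record type numbers from the DNS specification.
QTYPE = {"A": 1, "AAAA": 28}


def build_dns_query(domain, record_type="A", query_id=0x1234):
    """Build a DNS query message asking for one record type of one domain."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)

    # The name is sent label by label, each one prefixed with its length.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in domain.split(".")
    ) + b"\x00"

    # Question tail: record type and class (1 = IN, the internet).
    return header + qname + struct.pack("!HH", QTYPE[record_type], 1)


a_query = build_dns_query("academy.maikroservice.com", "A")
aaaa_query = build_dns_query("academy.maikroservice.com", "AAAA")
print(a_query.hex())
print(aaaa_query.hex())
```

If you compare the two hex strings, they differ only in the record type field near the end, which is what Wireshark decodes for you in the query details.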
Pheeewww, that's a lot of knowledge, I think I need to digest all of that for now.
Smart choice - we are about halfway, let's continue tomorrow with TCP/UDP & TLS